Due to the Energy Transition, the use of power transmission and distribution grids is changing. The control architecture of power grids needs to be swiftly adapted to account for infeed at lower grid levels, higher dynamics in flow patterns, and more distributed controls (both internal controls and grid flexibility services from third parties). In this context, TSOs and DSOs require a new generation of Digital Substation Automation Systems (DSAS) that can provide more complex, dynamic, and adaptive automation functions at grid nodes and edges, as well as enhanced orchestration from central systems, in a flexible and scalable manner. Virtualization is seen as a key innovation to fulfill these needs.
SEAPATH, Software Enabled Automation Platform and Artifacts (THerein), aims at developing a "reference design" and "industrial-grade" open source real-time platform that can run virtualized automation and protection applications, primarily for the power grid industry and potentially beyond. This platform is intended to host multi-provider applications.
Due to the nature of the virtualized applications, whose function is to regulate, control, command and transmit information relating to the operation, management and maintenance of an electrical substation, the virtualization base must meet the challenges of reliability, performance and availability.
The virtualization platform uses the following open source tools:
The Yocto Project is a Linux Foundation collaborative open source project whose goal is to produce tools and processes that enable the creation of Linux distributions for embedded and IoT software that are independent of the underlying architecture of the embedded hardware.
The Yocto Project provides interoperable tools, metadata, and processes that enable the rapid, repeatable development of Linux-based embedded systems in which every aspect of the development process can be customized.
The Layer Model simultaneously supports collaboration and customization. Layers are repositories that contain related sets of instructions that tell the OpenEmbedded build system what to do. You can collaborate, share, and reuse layers.
Layers can override instructions or settings provided by other layers. This powerful override capability is what allows you to customize previously supplied collaborative or community layers to suit your product requirements.
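For illustration, registering and inspecting layers is typically done with the bitbake-layers tool; the layer name meta-custom below is hypothetical:

# Add a hypothetical custom layer to the build configuration
bitbake-layers add-layer ../meta-custom
# List the layers currently used by the build, with their priorities
bitbake-layers show-layers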
SEAPATH follows the applicable cybersecurity guidelines defined by the ANSSI in the DAT-NT-28/ANSSI/SDE/NP document. Several mechanisms have been taken into account in order to guarantee the system's security:
SEAPATH uses a fully preemptible Linux kernel, which provides real-time capabilities.
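As a quick sanity check (a hedged example, not part of the SEAPATH tooling), you can verify that the running kernel was built with real-time preemption by looking for PREEMPT_RT in the kernel version string:

# Print the kernel version string; PREEMPT_RT indicates a fully preemptible kernel
uname -v | grep PREEMPT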
The cyclictest tool is used to accurately and repeatedly measure the difference between a thread's intended wake-up time and the time at which it actually wakes up, in order to provide statistics about the system's latencies. It can measure latencies in real-time systems caused by the hardware, the firmware, and the operating system. Note that the preemptible Linux kernel alone is not enough to guarantee real-time performance of the guests. Real-time applications, network configuration, and so on must also be designed carefully.
The result and analysis of Real-time tests performed on a SEAPATH cluster can be found in section RT test results.
This section describes how to build and configure a minimal cluster to use the SEAPATH project. The following tools are used during the process:
As described above, the SEAPATH minimal cluster requires three machines: two cluster machines (or hypervisors) and an observer. The steps to build and flash the Yocto images for the cluster machines are described in https://github.com/seapath/yocto-bsp.
Once the machines have been flashed with the corresponding SEAPATH Yocto images, the cluster can be configured by using the Ansible tool. You can follow the procedure described on https://github.com/seapath/ansible to deploy the cluster or create your own Ansible playbooks.
As described in the previous link, the different configuration and setup tasks have been gathered in a single playbook, so it is enough to execute:
ansible-playbook -i inventories/cluster_inventory.yaml playbooks/cluster_setup_main.yaml
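For reference, a minimal cluster inventory could be structured like the sketch below. All group names, hostnames, and IP addresses here are hypothetical; use the inventory examples shipped in the seapath/ansible repository as the authoritative layout:

# Hypothetical minimal inventory sketch; adapt to the seapath/ansible examples
all:
  children:
    hypervisors:
      hosts:
        hypervisor1:
          ansible_host: 192.168.1.10
        hypervisor2:
          ansible_host: 192.168.1.11
    observers:
      hosts:
        observer1:
          ansible_host: 192.168.1.12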
Attention: while executing this playbook, the installation media (USB key) containing the SEAPATH Yocto images must be removed once the images have been flashed to disk. Remove the USB key when the Ansible message "Wait for all cluster machines to be online" appears.
Once the cluster has been configured, you are ready to deploy VMs on it. The cluster_vm Ansible module is provided to manage virtual machines. It can be called from a playbook to perform actions on VMs. For instance, an example playbook that creates a VM from a predefined disk image and Libvirt XML configuration would be:
- name: Create and start guest0
  cluster_vm:
    name: guest0
    command: create
    system_image: my_disk.qcow2
    xml: "{{ lookup('file', 'my_vm_config.xml', errors='strict') }}"
Playbooks can be executed on any hypervisor. Other example playbooks are stored in the example/playbooks/vm directory.
This section describes the VM architecture and the cluster_vm commands from a high-level point of view. Please read the documentation in Annex 1, cluster_vm module documentation, for further information.
Note: like other Ansible modules, the cluster_vm documentation can also be displayed by executing the ansible-doc cluster_vm command from the Ansible root repository.
You will also find information on how to troubleshoot problems related to VM management in Annex 2, Troubleshooting VM management.
In the SEAPATH cluster the VMs can have several statuses:
The diagram below describes how a VM is stored in the SEAPATH cluster. All non-volatile VM data is stored using Ceph, which is in charge of maintaining the data store and replicating data between all the hypervisors.
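Since all persistent VM data lives in Ceph, standard Ceph commands can be used from any hypervisor to inspect the data store. A hedged sketch (the pool name rbd is an assumption; SEAPATH may use a different pool):

# Check overall cluster health and replication status
ceph status
# List the RBD images backing the VM disks (pool name is an assumption)
rbd ls rbd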
Metadata provides information associated with a VM. It consists of a list of (key, value) pairs that are set when the VM is created. You can define as many metadata fields as you want, but some keys are reserved:
Key | Meaning
vm_name | VM name
_base_xml | Initial Libvirt XML VM configuration
xml | Libvirt XML file used for the VM configuration. It is autogenerated by modifying the _base_xml file.
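For illustration, custom metadata might be attached at creation time as sketched below. The metadata parameter name is an assumption based on this description, not a confirmed option; check the module documentation in Annex 1 for the real interface:

- name: Create guest0 with custom metadata
  cluster_vm:
    name: guest0
    command: create
    system_image: my_disk.qcow2
    xml: "{{ lookup('file', 'my_vm_config.xml', errors='strict') }}"
    # 'metadata' is assumed here; see Annex 1 for the actual option name
    metadata:
      owner: team-a
      purpose: protection-relay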
The VM data disk is set when creating a new VM or cloning an existing one, as described in the schemas below.
Create a VM from scratch by importing a disk image with the create command:
- name: Create and start guest0
  cluster_vm:
    name: guest0
    command: create
    system_image: my_disk.qcow2
    xml: "{{ lookup('file', 'my_vm_config.xml', errors='strict') }}"
Copy an existing VM with the clone command:
- name: Clone guest0 into guest1
  cluster_vm:
    name: guest1
    src_name: guest0
    command: clone
The network configuration inside the VMs is done with the playbook file cluster_setup_network.yaml. You need to use an inventory that describes the VMs instead of the cluster, as in the example vms_inventory_example.yaml file.
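As an illustration, such a VM inventory could look like the hypothetical sketch below (group name, hostnames, and addresses are invented; vms_inventory_example.yaml is the authoritative reference):

# Hypothetical VM inventory sketch; see vms_inventory_example.yaml for the real layout
all:
  children:
    VMs:
      hosts:
        guest0:
          ansible_host: 192.168.1.50
        guest1:
          ansible_host: 192.168.1.51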
Disk snapshots can be used to save the disk image data at a given moment so that it can be restored later.
Snapshots can be created while the VM is stopped or running, but if you take a snapshot while the VM is running, only the data already written to the disk will be saved. Volatile data, such as the contents of the RAM or data not yet written to the disk, will not be stored in the snapshot.
- name: Create a snapshot of guest0
  cluster_vm:
    name: guest0
    command: create_snapshot
    snapshot_name: snap1
You can restore the VM to a previous state by performing a rollback operation based on a snapshot. The data saved during the snapshot operation will be restored and will replace the current disk image data; all current disk image data will be lost. The rollback operation does not remove the snapshot, so the same snapshot can be reused for a later rollback.
The rollback operation must be applied to a disabled machine. If the VM is enabled, it will be automatically disabled before the rollback and re-enabled once the operation is finished.
- name: Rollback VM guest0 to snap0
  cluster_vm:
    name: guest0
    command: rollback_snapshot
    snapshot_name: snap0
With the cluster_vm module it is also possible to:
An example playbook that removes the snapshots created before a given date would be:
# Example - Remove old snapshots
- name: Remove snapshots of guest0 older than January 24th 2021 8:00 AM
  cluster_vm:
    name: guest0
    command: purge_image
    purge_date:
      date: '2021-01-24'
      time: '08:00'
The purge operation can be performed regularly to avoid running out of space. This can easily be done with a tool like Ansible Tower or AWX.
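If you do not run Ansible Tower or AWX, a plain cron entry on a control machine can achieve the same periodic purge. The schedule, paths, and the purge playbook name below are only illustrative:

# Run a hypothetical purge playbook every Sunday at 03:00
0 3 * * 0 ansible-playbook -i /path/to/inventories/cluster_inventory.yaml /path/to/playbooks/purge_old_snapshots.yaml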
Updating the VM data cannot be performed by the cluster_vm module, but you can use its snapshot system to cancel the update in case of error as described in the diagram below. To achieve this, you can base your playbook on the example found in the Ansible git repository: examples/playbooks/update_skeleton.yaml.
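A possible shape for such a playbook, sketched here with Ansible's block/rescue error handling, is shown below. The host group and the update task are placeholders; the real skeleton to start from is examples/playbooks/update_skeleton.yaml:

- name: Update guest0 with automatic rollback on failure
  hosts: hypervisors  # assumed group name; adapt to your inventory
  tasks:
    - name: Snapshot before updating
      cluster_vm:
        name: guest0
        command: create_snapshot
        snapshot_name: pre_update

    - block:
        - name: Apply the update (placeholder for your real update tasks)
          debug:
            msg: "update guest0 here"
      rescue:
        - name: Update failed, roll back to the pre-update snapshot
          cluster_vm:
            name: guest0
            command: rollback_snapshot
            snapshot_name: pre_update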
The VM configuration and metadata are immutable. To change them, you must create a new VM from the existing one with the clone command.
The file examples/playbooks/update_config_skeleton.yaml can help you to create a playbook to achieve this operation according to the following diagram.
A continuous integration (CI) process has been implemented in order to build and deploy a custom cluster and automate the periodic validation of the development. The source code can be found at https://github.com/seapath/ci.
The CI is based on a Jenkins server that has been completely dockerized in order to guarantee its reproducibility and scalability.
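For context, a dockerized Jenkins can be started with a single command. The sketch below uses the upstream jenkins/jenkins:lts image and is not the SEAPATH CI's exact setup:

# Start a dockerized Jenkins with a persistent home volume and the agent port exposed
docker run -d -p 8080:8080 -p 50000:50000 -v jenkins_home:/var/jenkins_home jenkins/jenkins:lts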
Remote actions are achieved thanks to Ansible playbooks (https://github.com/seapath/ansible).
Jenkins is an open source automation server that supports periodic builds, tests, deployments, and other tasks that need to be automated, such as code synchronization.
The main CI job can be divided into different stages:
As shown in the diagram, each step of the chain is validated with the corresponding Cukinia tests, and the results are collected on the CI Jenkins server.
Jenkins offers a web UI to configure, manage, and follow the progress of its jobs. As shown in the following picture, the stage view shows job information such as the progress of the pipeline execution or the trend of the test results across consecutive job executions.
The results of the Cukinia tests run on the cluster are retrieved and displayed on the Jenkins UI.
The Blue Ocean plugin offers an intuitive UI that simplifies viewing and editing pipelines. It also permits re-running stages of a given build.
The complete output of a job execution can be obtained as well as the partial logs for each stage, as shown on the following picture.
In order to validate the real-time requirements of the SEAPATH project, the CI also makes it possible to execute cyclictest runs on the VMs and retrieve the results.
The document below describes the test cases, which must meet the following objectives:
seapath.pptx
The documentation is available directly on GitHub.
Here is the main documentation to build a SEAPATH image: https://github.com/seapath/yocto-bsp/blob/master/README.adoc
The Yocto Project requires a powerful Linux-based machine.
To build the SEAPATH project efficiently, we recommend not using a virtual machine. The Yocto build system parallelizes the build, so try to use a build machine with many CPU cores.
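Yocto's build parallelism can also be tuned explicitly in conf/local.conf. The values below are illustrative for a 12-core machine; by default both settings follow the number of available cores:

# Example conf/local.conf settings for a 12-core build machine
BB_NUMBER_THREADS = "12"
PARALLEL_MAKE = "-j 12"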
Here is a discussion on the Yocto Project mailing list: https://lists.yoctoproject.org/g/yocto/topic/72047879#48815
Here is, for instance, a build configuration (~1500 euros) that was used:
Part | Specification
CPU | AMD Ryzen 9 3900X Wraith Prism LED RGB (3.8 GHz / 4.6 GHz)
Cooling | Noctua NH-U14S
Motherboard | ASUS PRIME X570-P
Power supply | Seasonic Prime Ultra 650 W Gold
RAM | G.Skill Flare X Series 32 GB (2 x 16 GB) DDR4 3200 MHz CL14
SSD (SATA) | Samsung SSD 860 EVO 500 GB
SSD (NVMe) | Corsair Force MP600 1 TB
GPU | ASUS Radeon R7 240 R7240-2GD3-L
Case | Phanteks Enthoo Pro
Note: the build produces a very large number of files, which can cause rm -rf to fail with an error. find . -delete will work better, as it will not try to index all files before deleting them.

The test bench hardware used for the measurements was the following:

Parts | Specifications
Motherboard | ASMB-823
Chipset | Intel C612
CPU | Xeon E5-2680 v4 (14 cores, 2.4 GHz, 35 MB cache, LGA 2011-3)
Memory | 2x 8 GB R-DDR4-2400 1.2V 1Gx8 HYX
Disk | SQF 2.5" SATA SSD 830, 512 GB MLC (-40~85°C)
NIC | Intel I210, 10/100/1000M, PCIe x4, 2 ports
With the previous test bench hardware, a couple of tests were run.
We used cyclictest:
"Cyclictest accurately and repeatedly measures the difference between a thread's intended wake-up time and the time at which it actually wakes up in order to provide statistics about the system's latencies. It can measure latencies in real-time systems caused by the hardware, the firmware, and the operating system." (source: https://wiki.linuxfoundation.org/realtime/documentation/howto/tools/cyclictest/start).
The following arguments were provided:
cyclictest -l100000000 -m -Sp90 -i200 -h400 -q >output
This test is very long (~5 hours): 100,000,000 iterations at a 200 µs interval amount to about 20,000 s.
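If you only need a quick sanity check before committing to the full run, the loop count can be reduced; this is a hedged example, and short runs are far less representative of worst-case latency:

# Quick ~20-minute sanity run: 6,000,000 loops x 200 us = 1,200 s
cyclictest -l6000000 -m -Sp90 -i200 -h400 -q >output_short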
You can then plot the latency graph:
./yocto-bsp/tools/gen_cyclic_test.sh -i output -n 28 -o output.png
All Yocto images include the ability to run guest virtual machines (VMs). We use KVM and QEMU to run them. As there is no window manager on the host system, VMs should be launched in console mode and their console output must be correctly set.
For testing purposes, we can run our Yocto image as a guest machine. We do not use the .wic image, which includes the Linux kernel and the rootfs, because we need to set the console output.
We use two distinct files to modify the Linux Kernel command line:
- bzImage: the Linux Kernel image
- seapath-test-image-votp.ext4: the rte rootfs
Then run:
qemu-system-x86_64 -accel kvm -kernel bzImage -m 4096 -hda seapath-test-image-votp.ext4 -nographic -append 'root=/dev/sda console=ttyS0'
You can use Docker's check-config.sh script to check that all the necessary host Linux kernel configuration options are set.
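One possible way to fetch and run the script (the URL points to the upstream moby repository; network access is assumed):

# Download and run Docker's kernel configuration checker
curl -fsSL https://raw.githubusercontent.com/moby/moby/master/contrib/check-config.sh -o check-config.sh
bash check-config.sh

The output obtained on the test machine is reproduced below: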
info: reading kernel config from /proc/config.gz ...

Generally Necessary:
- cgroup hierarchy: properly mounted [/sys/fs/cgroup]
- CONFIG_NAMESPACES: enabled
- CONFIG_NET_NS: enabled
- CONFIG_PID_NS: enabled
- CONFIG_IPC_NS: enabled
- CONFIG_UTS_NS: enabled
- CONFIG_CGROUPS: enabled
- CONFIG_CGROUP_CPUACCT: enabled
- CONFIG_CGROUP_DEVICE: enabled
- CONFIG_CGROUP_FREEZER: enabled
- CONFIG_CGROUP_SCHED: enabled
- CONFIG_CPUSETS: enabled
- CONFIG_MEMCG: enabled
- CONFIG_KEYS: enabled
- CONFIG_VETH: enabled
- CONFIG_BRIDGE: enabled
- CONFIG_BRIDGE_NETFILTER: enabled
- CONFIG_NF_NAT_IPV4: enabled
- CONFIG_IP_NF_FILTER: enabled
- CONFIG_IP_NF_TARGET_MASQUERADE: enabled
- CONFIG_NETFILTER_XT_MATCH_ADDRTYPE: enabled
- CONFIG_NETFILTER_XT_MATCH_CONNTRACK: enabled
- CONFIG_NETFILTER_XT_MATCH_IPVS: enabled
- CONFIG_IP_NF_NAT: enabled
- CONFIG_NF_NAT: enabled
- CONFIG_NF_NAT_NEEDED: enabled
- CONFIG_POSIX_MQUEUE: enabled

Optional Features:
- CONFIG_USER_NS: enabled
- CONFIG_SECCOMP: enabled
- CONFIG_CGROUP_PIDS: enabled
- CONFIG_MEMCG_SWAP: enabled
- CONFIG_MEMCG_SWAP_ENABLED: enabled
    (cgroup swap accounting is currently enabled)
- CONFIG_LEGACY_VSYSCALL_EMULATE: enabled
- CONFIG_BLK_CGROUP: enabled
- CONFIG_BLK_DEV_THROTTLING: enabled
- CONFIG_IOSCHED_CFQ: enabled
- CONFIG_CFQ_GROUP_IOSCHED: enabled
- CONFIG_CGROUP_PERF: enabled
- CONFIG_CGROUP_HUGETLB: enabled
- CONFIG_NET_CLS_CGROUP: enabled
- CONFIG_CGROUP_NET_PRIO: enabled
- CONFIG_CFS_BANDWIDTH: enabled
- CONFIG_FAIR_GROUP_SCHED: enabled
- CONFIG_RT_GROUP_SCHED: missing
- CONFIG_IP_NF_TARGET_REDIRECT: enabled
- CONFIG_IP_VS: enabled
- CONFIG_IP_VS_NFCT: enabled
- CONFIG_IP_VS_PROTO_TCP: enabled
- CONFIG_IP_VS_PROTO_UDP: enabled
- CONFIG_IP_VS_RR: enabled
- CONFIG_EXT4_FS: enabled
- CONFIG_EXT4_FS_POSIX_ACL: enabled
- CONFIG_EXT4_FS_SECURITY: enabled
- Network Drivers:
  - "overlay":
    - CONFIG_VXLAN: enabled
      Optional (for encrypted networks):
      - CONFIG_CRYPTO: enabled
      - CONFIG_CRYPTO_AEAD: enabled
      - CONFIG_CRYPTO_GCM: missing
      - CONFIG_CRYPTO_SEQIV: missing
      - CONFIG_CRYPTO_GHASH: missing
      - CONFIG_XFRM: enabled
      - CONFIG_XFRM_USER: enabled
      - CONFIG_XFRM_ALGO: enabled
      - CONFIG_INET_ESP: missing
      - CONFIG_INET_XFRM_MODE_TRANSPORT: missing
  - "ipvlan":
    - CONFIG_IPVLAN: enabled
  - "macvlan":
    - CONFIG_MACVLAN: enabled
    - CONFIG_DUMMY: missing
  - "ftp,tftp client in container":
    - CONFIG_NF_NAT_FTP: enabled
    - CONFIG_NF_CONNTRACK_FTP: enabled
    - CONFIG_NF_NAT_TFTP: missing
    - CONFIG_NF_CONNTRACK_TFTP: missing
- Storage Drivers:
  - "aufs":
    - CONFIG_AUFS_FS: missing
  - "btrfs":
    - CONFIG_BTRFS_FS: missing
    - CONFIG_BTRFS_FS_POSIX_ACL: missing
  - "devicemapper":
    - CONFIG_BLK_DEV_DM: enabled
    - CONFIG_DM_THIN_PROVISIONING: missing
  - "overlay":
    - CONFIG_OVERLAY_FS: missing
  - "zfs":
    - /dev/zfs: missing
    - zfs command: missing
    - zpool command: missing

Limits:
- /proc/sys/kernel/keys/root_maxkeys: 1000000
More details can be found there.
The code of conduct is available here.