In recent years, use of SURFconext has increased substantially. More and more institutions use SURFconext, and the number of cloud services is growing as well. These include timetabling applications, Google Apps for Education, Blackboard and student portals. Institutions are becoming increasingly dependent on SURFconext, so system failures would have a significant impact. For this reason, SURFnet is constantly working to improve the availability and reliability of SURFconext.
The SURFconext platform runs in two physically separate data centres: Nikhef and Vancis. If one of the two data centres or its hardware fails, the other location takes over automatically. The network connections to the data centres are also redundant.
The SURFconext applications (SAML login, API groups, SURFconext Teams, SURFconext Dashboard, SURFconext Strong Authentication) run on a total of 8 application servers (4 sets of 2). Two database servers and two load balancers are also available. Three of the application servers are used to process all logins. The load balancer determines which application server a user reaches. One load balancer and one database server are always on stand-by. If the active load balancer or database server fails, the stand-by takes over automatically.
The fourth application server is used for longer-term testing of new software releases in the production environment. SURFnet itself also uses this environment, and in exceptional cases, such as testing special cloud services or new user interfaces, others can be allowed to log in via this environment as well. The management and statistics software runs on separate servers. These applications configure and closely monitor the platform.
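The stand-by and load-balancing behaviour described above can be illustrated with a small sketch. It is not taken from the SURFconext or OpenConext code base: the class names, the node names and the round-robin selection policy are all assumptions made for illustration, since the text does not say how the load balancer chooses a server.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Iterator, List


@dataclass
class Node:
    """A server with a health flag (hypothetical model, not a real API)."""
    name: str
    healthy: bool = True


class ActiveStandbyPair:
    """One active node and one stand-by; the stand-by takes over on failure."""

    def __init__(self, active: Node, standby: Node) -> None:
        self.active = active
        self.standby = standby

    def current(self) -> Node:
        # Promote the stand-by if the active node has failed.
        if not self.active.healthy and self.standby.healthy:
            self.active, self.standby = self.standby, self.active
        if not self.active.healthy:
            raise RuntimeError("both nodes in the pair are down")
        return self.active


class LoginBalancer:
    """Round-robin over healthy login servers (assumed policy)."""

    def __init__(self, servers: List[Node]) -> None:
        self.servers = servers
        self._ring: Iterator[Node] = cycle(servers)

    def pick(self) -> Node:
        # Try each server at most once per call, skipping unhealthy ones.
        for _ in range(len(self.servers)):
            server = next(self._ring)
            if server.healthy:
                return server
        raise RuntimeError("no healthy login server available")


if __name__ == "__main__":
    databases = ActiveStandbyPair(Node("db-1"), Node("db-2"))
    databases.active.healthy = False      # simulate a database failure
    print(databases.current().name)       # the stand-by has taken over

    logins = LoginBalancer([Node("app-1"), Node("app-2"), Node("app-3")])
    print([logins.pick().name for _ in range(4)])
```

The same pair class models both the load balancers and the database servers, because the text describes the same active/stand-by arrangement for each.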
Further development of SURFconext
The SURFconext software is being developed continually. We recently launched strong authentication, and at the beginning of October various SURFconext components will be given a brand new look. In addition, we are working on new functionality such as guest usage and rich authorisation. The SURFconext team also spends a great deal of time making the SURFconext software even more robust. This includes improving monitoring and logging, making the software more modular, removing unwanted components and implementing the latest standards.
The SURFconext team strives to optimise the software and to release new functionality quickly and in small parts. This makes it easier to keep an overview of the test process and the releases. Even for relatively small releases, it takes a number of weeks before the software is available for production. The developers build their software in a special virtual machine on which OpenConext (the open source software behind SURFconext) has been installed. A number of tests are run automatically here. Once the developers are ready, they install the software in the testing environment. The SURFconext team then tests the software in this environment and, if everything works properly, installs it in the acceptance environment. This installation is carried out in exactly the same way as an installation in the production environment. To prevent errors, the installation is automated to the greatest possible extent. The roll-back of the software is also always tested in the acceptance environment. Finally, the software is installed on a separate server in the production environment, where the SURFconext team uses it for a short period.
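The release path described above amounts to an ordered series of stages, where a release only moves on once the previous stage has succeeded. A minimal sketch follows. The stage names are taken from the text, but the `promote` function and the placeholder checks are hypothetical and do not represent the team's actual tooling.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a check that returns True when the release passes.
Stage = Tuple[str, Callable[[str], bool]]


def promote(release: str, stages: List[Stage]) -> List[str]:
    """Run a release through the stages in order, stopping at the first failure.

    Returns the names of the stages the release passed.
    """
    passed: List[str] = []
    for name, check in stages:
        if not check(release):
            break
        passed.append(name)
    return passed


def always_ok(release: str) -> bool:
    """Placeholder check; a real check would run tests or an installation."""
    return True


# Stage order as described in the text; every check here is a placeholder.
PIPELINE: List[Stage] = [
    ("development VM (automated tests)", always_ok),
    ("testing environment", always_ok),
    ("acceptance environment (installation and roll-back)", always_ok),
    ("separate production server (team use)", always_ok),
]

if __name__ == "__main__":
    for stage_name in promote("example-release", PIPELINE):
        print("passed:", stage_name)
```

Stopping at the first failed stage reflects the "if everything works properly" condition in the text: a release that fails in the testing environment never reaches acceptance.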
The new software is installed during the maintenance window (Tuesday or Thursday between 05:00 and 07:00). In most cases, this is done without taking the service offline. Releases are always announced to our SURFconext contact persons five days in advance.
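The window and the notice period can be expressed as a short check. This is only a sketch under stated assumptions: times are treated as naive local times, the end of the window is treated as exclusive, "five days" is read as calendar days, and the function and constant names are invented for illustration.

```python
from datetime import date, datetime, time, timedelta

# datetime.weekday() counts Monday as 0, so Tuesday is 1 and Thursday is 3.
MAINTENANCE_WEEKDAYS = {1, 3}
WINDOW_START = time(5, 0)
WINDOW_END = time(7, 0)       # assumed exclusive
ANNOUNCE_DAYS_AHEAD = 5       # assumed to mean calendar days


def in_maintenance_window(moment: datetime) -> bool:
    """Return True if the moment falls on Tuesday or Thursday, 05:00 to 07:00."""
    return (
        moment.weekday() in MAINTENANCE_WEEKDAYS
        and WINDOW_START <= moment.time() < WINDOW_END
    )


def announcement_deadline(release_day: date) -> date:
    """Return the latest day on which a release must be announced."""
    return release_day - timedelta(days=ANNOUNCE_DAYS_AHEAD)


if __name__ == "__main__":
    planned = datetime(2015, 10, 6, 5, 30)  # a Tuesday morning
    print(in_maintenance_window(planned))
    print(announcement_deadline(planned.date()))
```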
In the first half of 2016, the SURFconext team will focus predominantly on the reliability of our platform. In addition to the two locations in Amsterdam, we will also use a third location. This will improve availability and recovery time in the event of a major failure. The modular set-up that we strive for when developing our software will also be carried through into the infrastructure. This will make it easier to take specific measures for individual components, in line with the availability required of each component.
Do you have any questions about the infrastructure, applications, release management or the roadmap? If so, please contact email@example.com