I am posting this on behalf of @kaiyzen and the development team to give them space to continue resolving an annoying issue. The intention is to explain where 0.9.6.2 is up to, why it wasn’t released by 30th June as communicated here, and how we get to the release date.
In order to release, the following high-level steps need to complete:
- 0.9.6.2 Core Code released: complete and released in public on 23rd June 2020, passed testing
- REST API code: released on time for testing
- SDK - TS/JS: released on time for testing, minor bug fix after
- SDK - Java: released 1 day late, but testing made up the time
- Testing of the REST API and both SDKs started on Mon 29th and completed on 30th June 2020
- End to end testing in an isolated environment using bootstrap completed 30th June
This means that the release passed all testing and all components were complete and ready for release on time. At this point, the release exists in the development/test environment and in private branch(es), waiting for the final sign-off to be released to the publicly visible GitHub with tags.
In parallel to the above steps, a new Testnet was created, the bootstrap and testnet-bootstrap were updated, and it was running.
The only steps after this point are to release to Testnet, scale the chain with a few nodes, run a quick automation test, and then sign off for release to the public GitHub repo.
However, when the team began adding nodes to the Testnet, the nodes were unable to synchronise, meaning no new nodes could join it. The team has been working on this issue for most of last night and today. We held off this update until we had a reasonable idea of progress, to give as full a report as possible.
What has been found is:
- Running a fresh/clean testnet in the isolated testing environment doesn’t have this issue
- Copying data from the ‘public’ 0.9.6.2 testnet to the isolated one allows reproduction of the issue
There are two primary hypotheses just now: first, there may be a bug that only presents itself with the specific data on the chain that shows the issue; or second, there is an issue with the Testnet configuration/build.
Next steps will continue tonight and tomorrow. We expect these to conclude by Monday at the latest at this stage; if they complete sooner, the release will happen sooner. Either way, it will be announced as soon as practical:
- Reset the Testnet
- Re-run the automation tests
- If the above passes as it is expected to, release 0.9.6.2
- Continue investigating the issue in the reproduction environment and issue a hotfix later if necessary.
In parallel, the issue will be investigated to try to identify the root cause.
The reason this approach is being taken is that until we can state categorically what the issue is, there is a chance it could occur on other private chain deployments, for example if it is data related. However, at this stage it appears that a chain either does or does not have the issue, i.e. it doesn’t develop it. This means that if the Testnet is working, apart from this issue, then it is ready for release and community use/testing.
We hope that providing the full situation and information is useful for people trying to understand why the release is late. Unfortunately the issue was caught late. The teams are working very hard to track it down, and we expect to have an update by Monday, but possibly sooner.