Symbol Launch - General Status Update (01-Feb-2021)

It is just an upgrade, so no, not necessarily. That will be driven by how testing goes and any issues that come up. It is not a reset, and Testnet should continue on regardless of the fork.

Having to perform another fork is also a useful process to (re)test prior to launch. It will definitely happen on Mainnet, and last time the execution caused some issues on Testnet, which were learned from. So it is a useful side benefit that it will occur again on Testnet. That is not why it will happen (it is a technical change to fix the issue), but it is a useful benefit nevertheless.

6 Likes

Sorry for so many questions, but will there be another update this week or not?

Curious about some things from the update.
There should be NEMTus testing this week, but I don’t see an announcement from them. Is this open or closed testing? (nvm, already saw daoka announcing to test it today)
Also, it looks like no one has been assigned to those 2 recently found issues yet.

Hi, why is it going so slowly? I mean, why? Next week will be the 2nd week of Feb. If there’s still no snapshot date, maybe you don’t have a plan to launch it? It’s always next week, next week.

Seriously? @DaveH has continuously and repeatedly offered comprehensive updates on the steps being taken to make the launch a success. I suggest you read what he posts.

6 Likes

I saw Twitter posts to confirm that the new release has passed the “Daoka-Cannon” retest earlier today.

The issues have been submitted for testing already (we don’t actively use the assigned field in GitHub at present). It will take some time to run through the full testing etc.

1 Like

Additional Update (04-Feb-2021)

An additional update on the above that doesn’t really warrant a full post:

  • Symbol 0.10.0.6 has now been released

  • 0.10.0.6 has been applied to a majority of nodes on the Testnet; a few community ones still remain on the old version but will no doubt catch up over the next few days

  • Two remaining issues (#151 and #152) are in testing/bug fix at present; results will take a few days

  • NGL will perform INTERIM stress tests on the Testnet in the next 24-48 hours to ensure the recent release still passes the tests it passed previously. This will be a 400tps test and will run for 12-24 hours. It is likely to start in the next few hours, and you may notice Testnet is under heavy load as a result for the rest of the week

  • The final test will not be until after the next release (anticipated release will be early next week subject to testing over the weekend)

  • A Desktop & Mobile Wallet + CLI release is being prepared and undergoing final testing with a view to releasing tomorrow or Saturday

  • The team are also going to work on cleaning up the repos in terms of closing out issues, reviewing PRs and triaging/labelling problems. We expect to complete this next week to help improve visibility. I will also communicate a test and triage approach so the community can see how things are working and why issues do/don’t get picked up in GitHub; it will take me a few days to write up though
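To give a rough sense of scale for the interim stress test mentioned above, here is a small sketch of the arithmetic. It assumes the 400tps rate is sustained for the entire run, which a real test may not achieve; the function name is purely illustrative.

```python
def total_transactions(tps, hours):
    """Total transactions sent if `tps` is sustained for `hours` hours."""
    seconds = hours * 3600
    return tps * seconds


# Figures from the update: 400tps for 12-24 hours.
# Assumption: the rate is held constant for the whole run.
low = total_transactions(400, 12)
high = total_transactions(400, 24)
print(f"{low:,} to {high:,} transactions")  # 17,280,000 to 34,560,000 transactions
```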

18 Likes

Thanks for the clear update!

4 Likes

Thanks for the detailed update. Always appreciated. 🙂

5 Likes

Update - 05-Feb-2021 approx midday UTC

Further to this update, the Interim Load Test was started approx midnight UTC last night.

It appears to have caused issues on various nodes on the Testnet. It was planned to run for 12-24 hours but was stopped early as a result.

If you look on this page, the height column should ideally show most nodes within 1-2 blocks of each other, but currently doesn’t: symbol node list (testnet)
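As a minimal sketch of the "within 1-2 blocks" check described above: given a mapping of node name to reported height, the function below lists the nodes that trail the highest reported height by more than a tolerance. The node names, heights and the choice of the maximum height as the reference are all assumptions for illustration; this is not how the node list site itself works.

```python
def find_lagging_nodes(heights, tolerance=2):
    """Return {node: blocks_behind} for nodes trailing the chain tip.

    `heights` maps a node name to its reported block height.
    The chain tip is assumed to be the highest reported height.
    """
    if not heights:
        return {}
    tip = max(heights.values())
    return {
        node: tip - height
        for node, height in heights.items()
        if tip - height > tolerance
    }


# Hypothetical sample data, not real Testnet heights.
sample = {"node-a": 1000, "node-b": 999, "node-c": 940}
print(find_lagging_nodes(sample))  # {'node-c': 60}
```

Note that a node whose REST gateway is stale would show up here as lagging even if its peer process is actually in sync, since only the reported height is compared.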

The issue has affected the main NGL nodes which the Wallets etc. use as defaults, so you may find things like Wallets behave strangely while those nodes are resynchronised. The Faucet was affected and has been moved; it should be working again.

At this stage there is no estimate or confirmed root cause, and there won't be until the logs etc. are reviewed. The team will be looking into the issue and the steps to bring Testnet back into consistency throughout today, and as updates become available we will try to communicate them.

Slack conversation for anyone on Slack: https://nem2.slack.com/archives/C9YKR0EUX/p1612516491133100

4 Likes

Not only did the network stop, but nodes are also failing to recover.
Although it is not certain, nodes with higher specs tend to still be working.

Some VPS nodes have been wiped out.
We estimate that there are about 60 nodes that are currently working.

2 Likes

On the other hand, are the surviving nodes useful?
Is there any information I can provide?

Thanks @vistar, if you can join the conversation on Slack above, it would be helpful to keep things together.

Things that would probably be useful:

  • Server spec
  • Operating system
  • Node type (peers, api, dual, voting etc)
  • If you are using bootstrap or not

But it is definitely easiest to have it all in one Slack channel for people.

1 Like

How did the Daoka-Cannon retest pass, where I assume there was the same patch and 6k tps, when the current 400tps test has now caused the nodes to drop?

I think the symbol launch is best on 3/29

Symbol can be twins with nis1

2 Likes

Different kind of test; the Daoka-Cannon retest was checking a specific set of steps, which are now working ok

1 Like

Update 05-Feb-2021 approx 17:30 UTC

Huge thanks to Wayon, Jag, Gimre and the rest of the team for taking the time to explain the below alongside the investigation work. I’ve tried to summarise the current state as I understand it from what is happening:

The team are still looking at the issue. A summary of the currently known information is below; the investigation is ongoing and this is subject to change as more is known:

  • The issue affects api-broker, exactly why/how is being confirmed

  • The chain is still operating and Finality is still working, it can be checked on a known working API node: http://18.144.6.168:3000/chain/info

  • On affected nodes, the api-broker is down, so the REST gateway is not aware of the current state of the chain (MongoDB isn’t updated while the broker is down). REST therefore reports the state it knew up until the broker went down, rather than the actual current state of the chain on the peer node

  • The node list site (https://symbolnodes.org/nodes_testnet) relies on REST calls for chain height so may have issues reporting the actual height while the broker-node(s) are down

  • The auto-recovery issue that is present in Bootstrap (#108) means just restarting isn’t quite enough. That issue was already known, we knew it needed to be addressed, and it has obviously now risen in priority

  • Resetting and resynchronising the node does appear to resolve the issue and bring it back online, this is the only concrete approach we know definitely fixes it, but we are still looking for other ones
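The stale-REST symptom in the list above can be sketched as a comparison between the chain info reported by a possibly affected node and by a known working one. This is only an illustration: it assumes the `/chain/info` response is JSON with a `height` field (possibly encoded as a string), and the function names and the tolerance value are hypothetical.

```python
import json
import urllib.request


def fetch_chain_info(base_url, timeout=10):
    """Fetch and parse /chain/info from a REST gateway (makes a network call)."""
    with urllib.request.urlopen(f"{base_url}/chain/info", timeout=timeout) as response:
        return json.load(response)


def rest_lag(affected_info, reference_info):
    """Blocks by which the affected node's REST view trails the reference.

    Assumes both payloads carry a `height` field; int() accepts either
    a numeric value or a numeric string.
    """
    return int(reference_info["height"]) - int(affected_info["height"])


def rest_is_stale(affected_info, reference_info, tolerance=2):
    """True if the affected node's REST height is more than `tolerance` behind."""
    return rest_lag(affected_info, reference_info) > tolerance


if __name__ == "__main__":
    # Hypothetical payloads; real responses contain more fields.
    affected = json.loads('{"height": "120000"}')
    reference = json.loads('{"height": "123456"}')
    print(rest_lag(affected, reference), rest_is_stale(affected, reference))
```

In practice the two payloads would come from `fetch_chain_info` called against the affected node and against a known working API node such as the one linked above. A large lag only shows that REST is behind; as described above, the peer process on the same node may still be fully in sync.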

The process appears conceptually to have been something like:

  1. Api-broker had issues on some nodes (root cause still being fully identified)

  2. Api-broker failed due to the above and stopped

  3. Bootstrap Auto-recovery doesn’t allow it to restart and api-node ends in a state that cannot be easily recovered

  4. Peer node is still functioning normally.

  5. Issue only affects Dual or API nodes, it just happens that most nodes are dual nodes and most NGL nodes are dual and voting to simulate Mainnet in terms of SuperNodes

The work is going to continue today and over the weekend. We are likely to start resetting the NGL nodes in small batches soon, and that will obviously take a day or two due to the number of nodes involved and not wishing to disrupt the chain or finality.

Edit: Just noticed a tweet from Jag so linking here as well: https://twitter.com/Jaguar0625/status/1357725263245762560

14 Likes

Hi @DaveH, can you update us on the state of the testnet right now? Looks like the nodes have recovered.
Also, what about the issue that caused most of the nodes to drop, and is there a solution to fix what happened?

Anything new about the GitHub issues #151-152, and what are the next steps the team will take?

Testnet is now back to being consistent across almost all nodes

The issues are still being investigated and checked; I’ll update when we have more on them today/tomorrow

7 Likes

Sorry for the question. I’m new to the group. Is there a projected date for when the snapshot will take place and when the equivalent Symbol tokens would show up in my mobile Symbol wallet?

The snapshot date has not been announced, nor has the launch date. As for seeing the tokens in your mobile Symbol wallet, they should be there after launch, presumably automatically if you have opted in with your mobile app.