NEM node stuck on block height

Hi All.

We have set up a NEM node according to these instructions:

NEM Exchange Integration Guide

The node runs through the downloaded database, then pushes the CPU to 100% for about 3 hours before it calms down. After that the height does not increase, and we get the error shown below in the logs.

We have tried rebooting and deleting the database to start the sync from scratch.

2018-12-04 19:10:33.053 INFO block height: 1922333 (org.nem.nis.service.PushService m)
2018-12-04 19:10:33.054 INFO isLastBlockParent? false; last block height: 1680776; hash: d46de247a7191e93c8ce55c32d919310711f58002ae881fa2aaf2a2e5eb4ed16 (org.nem.nis.BlockChain a)
2018-12-04 19:10:33.054 INFO Warning: ValidationResult=FAILURE_ENTITY_UNUSABLE_OUT_OF_SYNC (org.nem.nis.service.PushService a)

This is a non-harvesting node setup. Any help appreciated.

Hello, what does your start script look like (runNis.bat or nixRunNis.sh, depending on which system you use)?
How much memory do you have available, and how much is allocated in the start script?

Thank you

Hi. The server has 16GB and 10GB is allocated in the startup script.

java -Xms10240M -Xmx10240M -cp ".:package/nis:package/nis/:package/libs/" org.nem.deploy.CommonStarter

Including all the system processes there is still 2GB free.

KiB Mem : 16432380 total, 2424760 free, 11384328 used, 2623292 buff/cache
KiB Swap: 999420 total, 999420 free, 0 used. 4615788 avail Mem

Are you running the very latest version?
Also make sure your system time isn’t out of whack.

Hi Memario

The time looks fine for my time zone.

@node2:~$ date
Wed Dec 5 12:23:46 SAST 2018

https://nemnodes.org/nodes/ is showing the version as 0.6.96

Because you allocated 10 GB, I would change the garbage collector to G1.

java -Xms5G -Xmx10G -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:+PrintGC -Xloggc:"./gc.txt" -cp ".:package/nis:package/nis/:package/libs/" org.nem.deploy.CommonStarter

Thanks pawelm. Will try that and let it rebuild the database. Will let you know.

With those changes the node used a bit more memory and once again got stuck, at a height of 1600775.

KiB Mem : 16432380 total,   493688 free, 14007048 used,  1931644 buff/cache
KiB Swap:   999420 total,   999308 free,      112 used.  2097852 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 1241 root      20   0 16.269g 0.013t  18916 S 398.3 83.8 418:13.58 java

The CPU fluctuated between 100% on one core and 400% across 4 cores. The Java HTTP service became unresponsive, I assume because of the high CPU usage, and could not respond to API calls. The logs show this error:

2018-12-05 20:24:34.852 SEVERE Http Status Code 500: Internal Server Error (org.nem.core.connect.ErrorResponse <init>)

The GC log file you added to the start command kept repeating this:

12306.073: [Full GC (Allocation Failure)  10217M->10215M(10G), 43.1955806 secs]
12349.269: [GC concurrent-mark-abort]
12349.282: [GC pause (G1 Evacuation Pause) (young)-- 10217M->10217M(10G), 0.0077040 secs]
12349.290: [GC pause (G1 Evacuation Pause) (young) (initial-mark) 10217M->10217M(10G), 0.0035179 secs]
12349.294: [GC concurrent-root-region-scan-start]
12349.294: [GC concurrent-root-region-scan-end, 0.0000414 secs]
12349.294: [GC concurrent-mark-start]
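
One way to read the GC excerpt above is to look at how much each Full GC actually frees. The sketch below is only an illustration: the function name summarize_full_gc is made up for this post, and the pattern only handles the M/G units that appear in the excerpt. A full collection that pauses for about 43 seconds and frees only a couple of MB out of 10 GB suggests the heap is almost entirely live data, which would fit the high CPU and the unresponsive HTTP service, although nothing here proves that is the root cause.

```python
import re

# Matches G1 "Full GC" lines in the shape shown in the gc.txt excerpt, e.g.
# 12306.073: [Full GC (Allocation Failure)  10217M->10215M(10G), 43.1955806 secs]
# Assumption: heap sizes are printed in M and total capacity in G, as in the excerpt.
FULL_GC = re.compile(
    r"\[Full GC \((?P<cause>[^)]*)\)\s+"
    r"(?P<before>\d+)M->(?P<after>\d+)M\((?P<cap>\d+)G\),\s+"
    r"(?P<secs>[\d.]+) secs\]"
)


def summarize_full_gc(line):
    """Return a small summary dict for a Full GC line, or None for other lines."""
    m = FULL_GC.search(line)
    if m is None:
        return None
    before_mb = int(m["before"])
    after_mb = int(m["after"])
    capacity_mb = int(m["cap"]) * 1024
    return {
        "cause": m["cause"],
        "reclaimed_mb": before_mb - after_mb,        # memory freed by this collection
        "occupancy_after": after_mb / capacity_mb,   # fraction of heap still in use
        "pause_secs": float(m["secs"]),
    }


sample = "12306.073: [Full GC (Allocation Failure)  10217M->10215M(10G), 43.1955806 secs]"
summary = summarize_full_gc(sample)
print(summary)
```

If reclaimed_mb stays near zero and occupancy_after stays near 1.0 across repeated Full GC lines, the JVM is most likely spending its time collecting rather than doing useful work.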

Thanks again for the help.

@BloodyRookie do you have any idea what’s wrong?

That only shows that the node is not completely synced. I need a bigger part of the log.
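
To put a number on "not completely synced", here is a rough sketch that pulls the two heights out of the log lines from the first post. It assumes that the "block height" logged by PushService is the height of a block pushed by a peer and that "last block height" is the tip of the local chain; the function name sync_gap is made up for illustration.

```python
import re

# Assumption: "INFO block height: N" is the height of a block pushed by a peer,
# and "last block height: N" is the height of the local chain's last block.
PUSHED = re.compile(r"INFO block height: (\d+)")
LOCAL = re.compile(r"last block height: (\d+)")


def sync_gap(log_text):
    """Return pushed height minus local height, or None if either is missing."""
    pushed = PUSHED.search(log_text)
    local = LOCAL.search(log_text)
    if pushed is None or local is None:
        return None
    return int(pushed.group(1)) - int(local.group(1))


sample = """\
2018-12-04 19:10:33.053 INFO block height: 1922333 (org.nem.nis.service.PushService m)
2018-12-04 19:10:33.054 INFO isLastBlockParent? false; last block height: 1680776; hash: d46de247a7191e93c8ce55c32d919310711f58002ae881fa2aaf2a2e5eb4ed16 (org.nem.nis.BlockChain a)
2018-12-04 19:10:33.054 INFO Warning: ValidationResult=FAILURE_ENTITY_UNUSABLE_OUT_OF_SYNC (org.nem.nis.service.PushService a)
"""
gap = sync_gap(sample)
print(gap)
```

Under that reading, the pushed block is far ahead of the local chain, so its parent is not the local last block and the node rejects it as out of sync.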

Hi BloodyRookie.

Here are the log files. I had to split them into two because of the file size restriction.

Please let me know if you require anything else.

nis-01.pdf (3.3 MB)
nis-02.pdf (2.8 MB)

The logs show that the node is still loading the chain.
The interesting part comes after the node has finished loading the chain, when syncing takes place.

So are you saying that the node is still synchronizing and needs to finish? It’s still at a height of 1680077 after 24 hours. Am I just being impatient?

The last entry in the log is
2018-12-06 21:32:18.255 INFO loadBlocks (from height 1507202 to height 1507301) needed 52ms (org.nem.nis.dao.BlockDaoImpl d)
so it didn’t even finish the initial loading of the blocks stored on the SSD. It might have finished by now, so can you supply the most recent log?
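
A quick way to check whether the initial load is still advancing is to look for the highest height in the most recent loadBlocks entry. This is a minimal sketch based only on the line format quoted above; the function name last_loaded_height is hypothetical, and the lines can come from whatever file your NIS instance writes its log to.

```python
import re

# Matches entries like:
# ... INFO loadBlocks (from height 1507202 to height 1507301) needed 52ms (...)
LOAD_BLOCKS = re.compile(
    r"loadBlocks \(from height (\d+) to height (\d+)\) needed (\d+)ms"
)


def last_loaded_height(log_lines):
    """Return the 'to height' of the last loadBlocks entry, or None if there is none."""
    last = None
    for line in log_lines:
        m = LOAD_BLOCKS.search(line)
        if m:
            last = int(m.group(2))
    return last


sample_lines = [
    "2018-12-06 21:32:18.255 INFO loadBlocks (from height 1507202 to height 1507301) "
    "needed 52ms (org.nem.nis.dao.BlockDaoImpl d)",
]
height = last_loaded_height(sample_lines)
print(height)
```

Running it against the log twice, a few minutes apart, shows whether the loaded height is still climbing or has stalled.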

Hi all, once again thanks for the support.

I started over and re-installed the node from scratch as per Paul’s steps in his guide, NEM Exchange Integration Guide.

Point 3.1 gives the link to a guide for setting up a supernode. I stopped at instruction 1.4 and did not complete the servant and starter service installs.

I ran the modified NIS start script in screen rather than using the starter service, and the node synced fine.
