
Troubleshoot Orchestrate

The following are common issues and how to fix them. If your issue isn't addressed, contact support.

Transaction stuck in pending status

If your transaction is stuck in pending status, first check if it was mined. Fetch the transaction using its UUID, then query for its receipt using eth_getTransactionReceipt.
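As a rough illustration of the receipt check, the following Python sketch builds an eth_getTransactionReceipt JSON-RPC request and interprets the response. It assumes you have already read the transaction hash from the transaction you fetched by UUID. The function names and the rpc_url parameter are hypothetical and are not part of Orchestrate.

```python
import json
import urllib.request


def build_receipt_request(tx_hash, request_id=1):
    """Build the JSON-RPC body for eth_getTransactionReceipt."""
    return {
        "jsonrpc": "2.0",
        "method": "eth_getTransactionReceipt",
        "params": [tx_hash],
        "id": request_id,
    }


def is_mined(rpc_response):
    """A null result means the node has no receipt yet, so the transaction is not mined."""
    return rpc_response.get("result") is not None


def fetch_receipt(rpc_url, tx_hash):
    """POST the request to the node at rpc_url (assumed) and decode the JSON response."""
    body = json.dumps(build_receipt_request(tx_hash)).encode("utf-8")
    request = urllib.request.Request(
        rpc_url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Calling is_mined(fetch_receipt(rpc_url, tx_hash)) returns False while the transaction is still pending.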

If there's no receipt, your transaction is still pending for one of the following reasons:

  • The allocated gas is too low. In Orchestrate version 22.4.0 or higher, you can:

    • "Speed up" the transaction by sending the same transaction with a defined higher gas.
    • "Call off" the transaction by sending an empty data transaction with the same nonce and 10% more gas than the original transaction.
    Caution: Create a retry policy to automatically "speed up" stuck transactions in the future.

  • The nonce sequence is invalid. Orchestrate is responsible for nonce management, but gaps can appear in the nonce sequence if you use an Orchestrate account outside of Orchestrate, or if different instances of Orchestrate are connected to different instances of Redis (where nonce sequence values are maintained).

    To fix this, you can:

    • Wait for the nonce manager expiration time to pass. This is 5 minutes by default and can be set using the environment variable NONCE_MANAGER_EXPIRATION.

    • Manually clean up Redis using the command redis-cli FLUSHDB. This deletes every key in the currently selected Redis database, not only the nonce values.
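The two causes above can be sketched in Python. The first helper computes the "10% more gas" value for a "call off" transaction, and the second lists the gaps in a nonce sequence. Both function names are hypothetical. The sketch assumes that confirmed_count comes from eth_getTransactionCount for the account at the latest block and that pending_nonces lists the nonces of the account's pending transactions.

```python
def call_off_gas(original_gas):
    """Return the original gas value increased by 10%, rounded up to an integer."""
    return (original_gas * 110 + 99) // 100


def find_nonce_gaps(confirmed_count, pending_nonces):
    """Return the nonces missing between the confirmed count and the highest pending nonce.

    confirmed_count is the next nonce the chain expects; any missing nonce at or
    above it blocks every pending transaction with a higher nonce.
    """
    pending = set(pending_nonces)
    if not pending:
        return []
    return [n for n in range(confirmed_count, max(pending) + 1) if n not in pending]
```

An empty list from find_nonce_gaps suggests that the nonce sequence is not the cause of the stuck transaction.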

If there's a receipt, your transaction was mined. Check that the Tx-Listener is working properly and the chains are synchronized: retrieve your registered chain and confirm that the value of listenerCurrentBlock is at least the number of the block that included your transaction.

If your chain synchronization is running behind, the Tx-Listener might simply need more time to retrieve the block that includes your transaction, or it might be affected by rate limiting on the connected RPC node.
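The synchronization check can be sketched as follows. The sketch assumes the chain object exposes listenerCurrentBlock as a decimal number (or decimal string) and that the receipt uses the standard JSON-RPC hex string for blockNumber. The function names are hypothetical.

```python
def blocks_behind(chain, receipt):
    """Return how many blocks the listener still has to process to reach the receipt's block."""
    listener_block = int(chain["listenerCurrentBlock"])  # assumed decimal
    receipt_block = int(receipt["blockNumber"], 16)  # JSON-RPC quantities are hex strings
    return max(0, receipt_block - listener_block)


def listener_has_reached(chain, receipt):
    """True when listenerCurrentBlock is at least the block that included the transaction."""
    return blocks_behind(chain, receipt) == 0
```

If listener_has_reached returns False, the block that includes your transaction has not been retrieved yet.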

Tx-Listener throttling RPC node

The Tx-Listener collects block information from the registered chains by making several calls to the node. These calls might exceed certain limits, especially when using blockchain service providers such as Infura. If you run into this issue, try the following:

  • If you have more than one chain registered using the same RPC node, use the Chain Proxy cache.
  • If you have only one chain registered, limit the number of parallel open connections to your node using the PROXY_MAXIDLECONNSPERHOST environment variable. This might increase the time required to synchronize your chains.

Transaction signing failure

Orchestrate depends on Quorum Key Manager (QKM) to perform signing operations. If your transactions fail with a "failed to sign" error, see the QKM logs for details.

The following are the most common HTTP errors and their possible causes:

  • Code 404: Orchestrate can't reach QKM or the store name isn't found. In this case, make sure the QKM server is reachable from your local Orchestrate instance and the store is correctly loaded.
  • Code 424: QKM has dependency issues such as failing to connect to the Key Vault or Postgres.
  • Code 401: Orchestrate fails to authenticate in QKM.
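If you triage these errors in a script, the list above can be captured as a small lookup table. This is only a convenience sketch. The dictionary and function names are hypothetical, and the messages paraphrase the causes listed above.

```python
# Possible causes for the HTTP status codes listed above (paraphrased).
QKM_ERROR_CAUSES = {
    401: "Orchestrate fails to authenticate in QKM.",
    404: "Orchestrate can't reach QKM or the store name isn't found.",
    424: "QKM has dependency issues, such as failing to connect to the Key Vault or Postgres.",
}


def explain_qkm_status(status_code):
    """Return the likely cause for a status code, or point to the QKM logs."""
    return QKM_ERROR_CAUSES.get(
        status_code, "Not a listed error; see the QKM logs for details."
    )
```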

Read more about configuring QKM.

Chain proxy failure

Before removing a chain in Orchestrate, check that there are no active jobs connected to the chain. If you remove a chain with active jobs, Orchestrate enters an inconsistent state in which services such as Tx-Sender and Tx-Listener can't complete the ongoing work and end up in a crash loop.

If you reach this state, access the database and manually remove all active jobs belonging to the removed chain, using the following SQL query:

DELETE FROM jobs j1
USING jobs as j2
LEFT JOIN chains as c on (c.uuid = j2.chain_uuid)
WHERE j1.id = j2.id and c.uuid is null;
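Before deleting anything, it can help to preview which rows the query above would match. The following Python sketch runs the same anti-join as a SELECT. It uses an in-memory SQLite database with a simplified two-table schema as a stand-in for the Orchestrate Postgres database, so the schema, the sample rows, and the function names are all assumptions.

```python
import sqlite3


def orphaned_job_ids(conn):
    """Return the ids of jobs whose chain_uuid matches no row in chains."""
    rows = conn.execute(
        "SELECT j.id FROM jobs AS j "
        "LEFT JOIN chains AS c ON c.uuid = j.chain_uuid "
        "WHERE c.uuid IS NULL ORDER BY j.id"
    ).fetchall()
    return [row[0] for row in rows]


def demo():
    """Build a tiny sample database (assumed schema) and list its orphaned jobs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE chains (uuid TEXT PRIMARY KEY);
        CREATE TABLE jobs (id INTEGER PRIMARY KEY, chain_uuid TEXT);
        INSERT INTO chains VALUES ('chain-a');
        INSERT INTO jobs VALUES (1, 'chain-a'), (2, 'chain-removed'), (3, 'chain-removed');
        """
    )
    try:
        return orphaned_job_ids(conn)
    finally:
        conn.close()
```

In this sample, jobs 2 and 3 reference a chain that no longer exists, so they are the rows a matching DELETE would remove.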

You can also perform a full cleanup of historical data, as described in the following section.

Clean up historical data

Most data in the Orchestrate Postgres database, such as jobs, schedules, and logs, isn't required for services to work, and remains available only as historical data.

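If you experience additional latency, you can run the following SQL queries to clean up your database. Run them in the order listed, because each later query removes rows left without a parent by the earlier ones: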

  • Delete jobs in final status:

    DELETE FROM jobs
    WHERE status IN ('FAILED', 'STORED', 'MINED');
  • Delete transactions without jobs:

    DELETE FROM transactions t1
    USING transactions as t2
    LEFT JOIN jobs as j on (j.transaction_id = t2.id)
    WHERE t1.id = t2.id and j.id is null;
  • Delete logs without jobs:

    DELETE FROM logs l1
    USING logs as l2
    LEFT JOIN jobs as j on (j.id = l2.job_id)
    WHERE l1.id = l2.id and j.id is null;
  • Delete schedules without jobs:

    DELETE FROM schedules s1
    USING schedules as s2
    LEFT JOIN jobs as j on (j.schedule_id = s2.id)
    WHERE s1.id = s2.id and j.id is null;
  • Delete transaction requests without schedules:

    DELETE FROM transaction_requests tr
    USING transaction_requests as tr2
    LEFT JOIN schedules as s on (s.id = tr2.schedule_id)
    WHERE tr.id = tr2.id and s.id is null;