Hyperion History API Solution
The History API has arguably been the most pressing issue on the EOS Main Net for the past few months. dApps, block explorers, and wallets must consult historical information to work properly, and running a full history node on the EOS Main Net has become expensive, complex and time-consuming.
The V1 History API was deprecated, and only a handful of BPs kept providing full public History nodes for the entire network (kudos to @eos.sw-eden, @CryptoLions, @EOSTribe, @Greymass, @EOS-Canada, @EOS-Asia and @OracleChain), while many others are putting considerable effort into different solutions to this problem.
Some argue that this is not a big issue because dApps can find a business model to pay for partial history nodes focused on their transactions, while block explorers and wallets could use light history. However, the prevailing perception in the community is that difficulties in providing historical chain data can hinder EOS's capacity to meet scalability expectations, and this kind of prophecy is usually self-fulfilling.
The EOS blockchain contains approximately 46 million blocks at the time of writing (3/7/2019), so a newcomer who wants to start providing the service must ingest all those blocks, plus the 2 new blocks appended to the chain every second, in a process that currently takes weeks. Once synced, the current v1 History Plugin takes more than 5 TB of storage to run. Querying this database demands a lot of processing power and network bandwidth. As a result, running a full history node can cost more than USD 15k/month.
A New Perspective
A few months ago the EOS Rio team started brainstorming possible solutions to this issue. Instead of attacking the perceived bottlenecks by increasing data ingestion, storage, and querying capacity, we decided to start from scratch. The first step was to analyze what could be done to reduce the size of the database itself. We learned that History API v1 stores a lot of redundant information.
The original history_plugin bundled with eosio, which provided the v1 API, stored inline action traces nested inside the root actions. This led to an excessive amount of data being stored and transferred whenever a user requested the action history for a given account. Also, inline actions are often used as an “event” mechanism to notify the parties to a transaction, and there is little value in storing those notifications.
Hyperion History implements a new approach to data structure and storage:
- actions are stored in a flattened format
- a parent field is added to the inline actions to point to the parent global sequence
- if the inline action data is identical to the parent's, it is considered a notification and thus removed from the database
- no block or transaction data is stored; all information can be reconstructed from actions
- no transaction validation information is stored, since it can be verified against the block information using the Chain API; dApps do not use History for that
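The flattening rules listed above can be sketched in a few lines of Python. This is only an illustration, not Hyperion's actual code: the input shape (a dict with `global_sequence`, `act` and `inline_traces` keys) and the `flatten` function are assumptions made for the example, and "identical to the parent" is interpreted here as the inline action carrying the same `act` payload as the nearest stored ancestor.

```python
from typing import Any, Dict, Iterator, Optional

Trace = Dict[str, Any]  # hypothetical nested trace: global_sequence, act, inline_traces


def flatten(trace: Trace, parent: Optional[Trace] = None) -> Iterator[Dict[str, Any]]:
    """Yield one flat record per stored action, depth-first."""
    if parent is not None and trace["act"] == parent["act"]:
        # Same payload as the parent: treat it as a notification and drop it.
        # Its own children, if any, are attached to the last stored ancestor.
        kept = parent
    else:
        record = {"global_sequence": trace["global_sequence"], "act": trace["act"]}
        if parent is not None:
            # Inline actions point back to the parent's global sequence.
            record["parent"] = parent["global_sequence"]
        yield record
        kept = trace
    for child in trace.get("inline_traces", []):
        yield from flatten(child, kept)


# Made-up example: a transfer that notifies both parties and triggers one inline action.
transfer = {
    "account": "eosio.token",
    "name": "transfer",
    "data": {"from": "alice", "to": "bob", "quantity": "1.0000 EOS"},
}
root = {
    "global_sequence": 100,
    "act": transfer,
    "inline_traces": [
        {"global_sequence": 101, "act": transfer, "inline_traces": []},  # notification
        {"global_sequence": 102, "act": transfer, "inline_traces": []},  # notification
        {
            "global_sequence": 103,
            "act": {"account": "bob", "name": "log", "data": {"memo": "received"}},
            "inline_traces": [],
        },
    ],
}
flat = list(flatten(root))
```

In this example four nested traces become two flat records: the root transfer and the `log` inline action, whose `parent` field is 100.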
With those changes, the API format focuses on delivering shorter response times, lower bandwidth overhead and easier usability for UI/UX developers.
The data structure is stored as follows:
Small but mighty
Changing the format and cutting data redundancies reduces database size by about 85%, from almost 5 TB to approximately 650 GB. To further improve performance, we engineered a multi-threaded indexer that extracts data from the state history plugin and makes it possible to ingest the complete EOS blockchain in approximately 72 hours with proper hardware optimization, while current solutions can take weeks.
We also introduced an “ABI History Caching Layer” component to prevent deserialization failures when historical data is processed in parallel across ABI modifications.
For the database, we deployed an Elasticsearch cluster running on two custom-assembled bare metal servers colocated on tier 1 infrastructure in Rio de Janeiro, Brazil.
The optimized data structure tends to reduce CPU and bandwidth consumption, making the infrastructure more scalable. Other BPs running full history APIs are already testing Hyperion and helping with its evolution.
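For a sense of scale, ingesting about 46 million blocks in 72 hours works out to roughly 177 blocks per second. One common way to reach that kind of throughput is to split the chain into block ranges and hand them to a pool of workers. The sketch below shows that general idea only; it is not Hyperion's indexer, and `block_ranges`, `index_range` and `index_chain` are hypothetical names whose worker does no real I/O.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, Tuple

BlockRange = Tuple[int, int]  # inclusive (first_block, last_block)


def block_ranges(first: int, last: int, size: int) -> Iterator[BlockRange]:
    """Split [first, last] into non-overlapping ranges of at most `size` blocks."""
    start = first
    while start <= last:
        end = min(start + size - 1, last)
        yield (start, end)
        start = end + 1


def index_range(block_range: BlockRange) -> int:
    """Placeholder worker (assumption): a real one would read these blocks
    from the node and write the extracted actions to the database.
    Returns the number of blocks handled."""
    start, end = block_range
    return end - start + 1


def index_chain(first: int, last: int, size: int, workers: int) -> int:
    """Process all ranges with a thread pool and return the total block count."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(index_range, block_ranges(first, last, size)))


total = index_chain(first=1, last=10_000, size=1_000, workers=4)
```

Because ranges are processed out of order, any state that changes over the life of the chain has to be looked up by block number rather than taken from the latest version.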
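To illustrate why such a layer helps: when old blocks are decoded in parallel, each action has to be deserialized with the ABI that was active for its contract at that block, not with the latest one. A minimal sketch of a per-account ABI history lookup follows. The `AbiHistory` class is hypothetical, ABIs are stood in for by plain strings, and it assumes, as a simplification, that an ABI applies from the block in which it was set onward.

```python
import bisect
from typing import Dict, List


class AbiHistory:
    """Keeps every ABI version per account, ordered by the block it was set in."""

    def __init__(self) -> None:
        self._blocks: Dict[str, List[int]] = {}
        self._abis: Dict[str, List[str]] = {}

    def record(self, account: str, block_num: int, abi: str) -> None:
        """Register that `account` changed its ABI at `block_num`."""
        blocks = self._blocks.setdefault(account, [])
        abis = self._abis.setdefault(account, [])
        i = bisect.bisect_right(blocks, block_num)  # keep both lists sorted by block
        blocks.insert(i, block_num)
        abis.insert(i, abi)

    def abi_at(self, account: str, block_num: int) -> str:
        """Return the ABI that was active for `account` at `block_num`."""
        blocks = self._blocks.get(account, [])
        i = bisect.bisect_right(blocks, block_num)
        if i == 0:
            raise KeyError(f"no ABI known for {account} at block {block_num}")
        return self._abis[account][i - 1]


history = AbiHistory()
history.record("eosio.token", 1, "abi-v1")
history.record("eosio.token", 5_000_000, "abi-v2")

old_abi = history.abi_at("eosio.token", 4_999_999)
new_abi = history.abi_at("eosio.token", 5_000_000)
```

A worker decoding block 4,999,999 gets the first version, while one decoding block 5,000,000 or later gets the second, regardless of the order in which the workers run.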
Suggesting a new History API standard
For developers, a flattened result is easier to work with than the output of today’s History API standard. The current eosio history plugin unnecessarily inflates the database with information that is redundant for end-user history purposes. Being able to filter inline actions reduces both API bandwidth consumption and coding complexity.
To accommodate those changes, EOS Rio and other BPs developing history solutions are advocating for a History API V2 standard to be adopted by the EOS community.
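As a rough illustration of why a flat list with a parent pointer is simpler to consume, filtering becomes a one-line list comprehension instead of a recursive walk over nested traces. The record shape used here (`global_sequence`, optional `parent`, `name`) is an assumption for the example and not the exact API response format.

```python
from typing import Any, Dict, List

Action = Dict[str, Any]  # hypothetical flat record


def top_level(actions: List[Action]) -> List[Action]:
    """Actions without a parent pointer, i.e. the root actions."""
    return [a for a in actions if "parent" not in a]


def inline_of(actions: List[Action], global_sequence: int) -> List[Action]:
    """Inline actions whose parent is the given global sequence."""
    return [a for a in actions if a.get("parent") == global_sequence]


# Made-up sample data.
actions = [
    {"global_sequence": 100, "name": "transfer"},
    {"global_sequence": 103, "name": "log", "parent": 100},
    {"global_sequence": 200, "name": "delegatebw"},
]
roots = top_level(actions)
children = inline_of(actions, 100)
```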
Please take the time to evaluate it and send us feedback.
An Open-Source Project
EOS Rio is already providing a History API using Hyperion at https://eos.hyperion.eosrio.io/v2/docs/index.html. The project code and preliminary setup instructions are available at https://github.com/eosrio/Hyperion-History-API. We are releasing this under an open source license for non-commercial use.
We are available to assist anyone who wants to run Hyperion History and we are keen on feedback.
The immediate next step is to implement a WebSocket API for action streaming. That’s what we’re working on now.
Once that is finished, the next feature will be Hyperion Analytics, an advanced layer on top of the Hyperion History API that offers detailed EOS statistics.