How We Built Chainalysis’ Robust Knowledge Graph for Solana Transactions

The Solana blockchain has several uniquely sophisticated features, including an account structure with system, token, and stake accounts, the ability to transfer funds by changing account ownership, and distinctive treatment of collateral in smart contracts. These characteristics can present challenges for investigators, compliance analysts, and other ecosystem participants seeking easy-to-understand models for transaction activity. Without a precise knowledge graph, anyone involved in blockchain analysis could reach incorrect conclusions. 

In this blog post, we’ll highlight how our research and development team accounted for Solana’s defining characteristics to build a robust knowledge graph of the blockchain and make investigation and compliance workflows more intuitive and trustworthy. 

Solana’s unique account structure and fund transfers through account ownership changes

First, let’s examine the account mechanics on Ethereum to better contextualize different account types in Solana.

Ethereum wallets support the storage of different types of assets, both fungible and non-fungible. An Ethereum wallet is represented by a single address, which is attributed to an entity or individual, and can be unlocked by a private key. 

Unlike Ethereum and other account-based blockchains, Solana has a unique account structure with multiple address types. Solana’s primary addresses are known as System Account Addresses, and they have control over all addresses beneath them. System Account Addresses are unlocked by private keys; the remaining addresses within the wallet do not have designated private keys. 

Under each System Account Address, there are two types of Inventory Account Addresses:

  1. Stake Accounts are used to delegate tokens to network validators to potentially earn rewards. 
  2. Token Accounts come in two forms:
    • Ancillary Token Accounts hold SPL tokens (the equivalent of ERC-20 tokens) or NFTs. Users can create as many of these accounts as they want. For example, a user could create multiple accounts that each hold some amount of USDC.
    • Associated Token Accounts are addresses designated for particular token mints. These accounts simplify SPL token transfers if recipients have multiple Ancillary Token Accounts with the same token. Additionally, users cannot send SPL tokens if recipients do not have Ancillary Token Accounts for the SPL tokens in question. Associated Token Accounts allow users to send SPL tokens even if recipients do not have Ancillary Token Accounts of that mint. 

To create a knowledge graph for Solana, we employed our blockchain analytics engine to carry out the clustering process, which groups together addresses controlled by the same real-world entity or service. Clustering lets us start with a single address attributed to a custodian and scale up to identify millions of addresses controlled by that custodian, so our users can view more of the activity associated with it. The clustering process typically consists of the following steps (the image below depicts the Chainalysis Clustering Engine): 

  1. Chainalysis operates full archival nodes and transforms the raw blockchain data retrieved from those nodes into structured and indexed blockchain data. We monitor and detect every interaction on the blockchain that is related to a value transfer.
  2. Our investigative team collects ground-truth identifications on individual services and businesses that operate on-chain. We also supplement this data with identifications shared by our network.
  3. We feed all of this information into our clustering engine. Hundreds of algorithms detect the unique fingerprint of a given service by analyzing the data produced in steps 1 and 2.
  4. We compare the outputs of all the algorithms in a Graph Traversal and Collision Resolution Engine to ensure internal consistency.

In Solana, clustering maps activity conducted by Token and Stake Accounts to their corresponding System Accounts and consolidates all System Accounts if they are controlled by the same on-chain entity. However, this process also needs to address how funds are transferred to ensure that both historical and current transactions are attributed to the correct owners. 
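The roll-up described above can be illustrated with a small sketch. This is not the Chainalysis clustering engine; it is a minimal union-find example, and the class name `Clusterer`, its methods, and the example addresses are all hypothetical. It assumes that the owning System Account of each Token or Stake Account is already known, and that some external evidence says two System Accounts belong to the same entity.

```python
class Clusterer:
    """Toy roll-up of inventory accounts to System Accounts (hypothetical API)."""

    def __init__(self):
        self.parent = {}    # union-find forest over System Account addresses
        self.owner_of = {}  # Token/Stake Account address -> System Account address

    def _find(self, addr):
        # Return the representative System Account of addr's cluster.
        self.parent.setdefault(addr, addr)
        while self.parent[addr] != addr:
            self.parent[addr] = self.parent[self.parent[addr]]  # path halving
            addr = self.parent[addr]
        return addr

    def register_inventory_account(self, inventory_addr, system_addr):
        # Record which System Account sits above a Token or Stake Account.
        self.owner_of[inventory_addr] = system_addr
        self._find(system_addr)

    def merge_system_accounts(self, a, b):
        # Consolidate two System Accounts believed to share one controller.
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a

    def cluster_of(self, addr):
        # Inventory accounts resolve through their System Account first.
        return self._find(self.owner_of.get(addr, addr))


clusterer = Clusterer()
clusterer.register_inventory_account("token_acct_1", "system_A")
clusterer.register_inventory_account("stake_acct_1", "system_B")
clusterer.merge_system_accounts("system_A", "system_B")
same_entity = clusterer.cluster_of("token_acct_1") == clusterer.cluster_of("stake_acct_1")
```

Note that this sketch stores only one owner per inventory account, which is exactly the simplification that breaks down when ownership changes over time.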

Solana provides two methods for transferring funds. The first involves simply transferring the tokens in question from one account to another, similar to transfers on Ethereum. The second method involves changing the associated ownership of a Token or Stake Account, which is more complicated. Any Solana tracing solution that doesn’t account for this runs the risk of attributing the historical activity of an account’s old owner to the new owner, posing a major problem for investigations.

The diagram below demonstrates an example of this phenomenon. On the left, we see all transactions conducted by Alice’s and Bob’s respective token accounts before any ownership changes. On the right, after Alice changes the ownership of her account and clustering is applied normally, Bob appears to inherit all of Alice’s token account’s historical transactions. This conclusion is inaccurate because Bob is now falsely associated with Alice’s historical activity. 

For SPL token transfers, changing account ownership is a rare phenomenon. We estimate that this occurs less than 1% of the time. For NFT transfers, we estimate that changing account ownership might occur more than 20% of the time, depending on the NFT marketplace involved. Even in the case of SPL token transfers, errors resulting from that 1% can have cascading effects on clustering. 

To correctly map transactions and eliminate concerns regarding Solana’s account complexities, we created specialized components that:

  • monitor the creation, closure, and activity of all Token and Stake Accounts.
  • track all of their current and historical owners.
  • model ownership changes as value transfers and roll those transfers up to the System Account level.
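The three components above can be sketched as a per-account ownership timeline. This is an illustrative example under stated assumptions, not the production design: the names `OwnershipHistory`, `owner_at`, and `ownership_change_as_transfer` are hypothetical, events are assumed to arrive in increasing slot order, and the account balance at the time of the change is assumed to be known.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class OwnershipHistory:
    """Owner timeline for one Token or Stake Account (hypothetical structure)."""
    slots: list = field(default_factory=list)   # slot at which each owner took over
    owners: list = field(default_factory=list)  # System Account owning from that slot

    def set_owner(self, slot, owner):
        # Assumption: calls arrive in increasing slot order.
        self.slots.append(slot)
        self.owners.append(owner)

    def owner_at(self, slot):
        # Latest ownership entry at or before `slot`; None before account creation.
        i = bisect_right(self.slots, slot) - 1
        return self.owners[i] if i >= 0 else None


def ownership_change_as_transfer(history, slot, new_owner, balance):
    # Model the ownership change as a value transfer between System Accounts.
    old_owner = history.owner_at(slot)
    history.set_owner(slot, new_owner)
    return {"slot": slot, "from": old_owner, "to": new_owner, "amount": balance}


history = OwnershipHistory()
history.set_owner(100, "alice_system")
transfer = ownership_change_as_transfer(history, 200, "bob_system", 50)
```

With a timeline like this, a transaction at slot 150 is attributed to `alice_system` and one at slot 250 to `bob_system`, so Bob does not inherit Alice's history.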

With these modifications, our Solana model is now similar to what customers see for Ethereum and other account-based blockchains. 

How Solana treats collateral

Another challenge within Solana is its treatment of collateral. A Solana account that posts collateral to receive new assets still technically holds that collateral in addition to the new assets themselves, resulting in double counting when wallet wealth is computed. 

By default, this is how transactions involving SOL and wrapped SOL (wSOL) appear on the blockchain: 

Read literally, the above diagram double counts a single transaction in the total balance of Wallet A. Inflating a wallet’s assets in this way leads to inaccurate conclusions about the amount of funds that can be recovered from an asset seizure or the size of exposure to a given entity. This is in stark contrast to Ethereum, where transactions that deposit ETH to receive wETH are only counted once, and therefore do not require modifications.

Blockchain | Wallet Balance under native coin view | Wallet Balance under wrapped token view | Total Wallet Balance
Ethereum   | 0 ETH                                 | 10 wETH                                 | 10 wETH
Solana     | 10 SOL (collateral)                   | 10 wSOL                                 | 10 SOL + 10 wSOL

To ensure collateral is not double-counted as part of a Solana wallet’s wealth, our R&D team created a synthetic address to represent the “escrow” (the collateral contract). In Step 2 of the below diagram, the SOL collateral is moved to an escrow address, which prevents double counting. 
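The effect of the synthetic escrow address on balances can be shown with a small ledger sketch. The address label `synthetic_escrow`, the function names, and the ledger structure are hypothetical, and the sketch assumes wSOL is valued one-to-one with SOL when totaling a wallet.

```python
from collections import defaultdict

ESCROW = "synthetic_escrow"  # hypothetical label for the synthetic escrow address


def new_ledger():
    # address -> asset -> balance, defaulting to zero
    return defaultdict(lambda: defaultdict(int))


def wrap_sol(balances, wallet, amount):
    # Move the SOL collateral to escrow, then credit the wallet with wSOL.
    balances[wallet]["SOL"] -= amount
    balances[ESCROW]["SOL"] += amount
    balances[wallet]["wSOL"] += amount


def total_in_sol(balances, wallet):
    # Assumption: wSOL is valued 1:1 with SOL.
    return balances[wallet]["SOL"] + balances[wallet]["wSOL"]


ledger = new_ledger()
ledger["wallet_A"]["SOL"] = 10
wrap_sol(ledger, "wallet_A", 10)
wallet_total = total_in_sol(ledger, "wallet_A")
```

In this sketch the wallet ends with 0 SOL and 10 wSOL, for a total of 10 rather than the 20 implied by counting the collateral and the wrapped token together.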

Additional modifications for user-friendly Solana analysis

Ninety percent of transfers in Solana involve reward-rent, vote, or non-vote transaction fees. When investigators and analysts graph the transfers of a given entity, this high number of transactions would make it very difficult to see the small number of value transfers relevant to their research. 

The Chainalysis R&D team tackled this obstacle by compressing the aforementioned transfers on Solana into a single synthetic transaction for each transaction type every 24 hours. This preserves the intelligence on these transfers while reducing unnecessary information for the user. 
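As a rough illustration of this compression, the sketch below sums fee-like transfers into one synthetic record per address, transfer type, and day. The record fields, the type labels in `NOISE_TYPES`, and the choice to bucket by UTC calendar day and by address are assumptions made for the example, not details confirmed by the approach described above.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical labels for the high-volume transfer types to compress.
NOISE_TYPES = {"reward_rent", "vote_fee", "non_vote_fee"}


def compress_transfers(transfers):
    """Split transfers into (kept, synthetic) lists.

    Each transfer is a dict with "address", "type", "timestamp" (Unix seconds),
    and "amount". Transfers of a noise type are summed per address, type, and
    UTC day; everything else is kept unchanged.
    """
    buckets = defaultdict(int)
    kept = []
    for t in transfers:
        if t["type"] in NOISE_TYPES:
            day = datetime.fromtimestamp(t["timestamp"], tz=timezone.utc).date()
            buckets[(t["address"], t["type"], day)] += t["amount"]
        else:
            kept.append(t)
    synthetic = [
        {"address": addr, "type": kind, "day": day.isoformat(),
         "amount": amount, "synthetic": True}
        for (addr, kind, day), amount in sorted(buckets.items())
    ]
    return kept, synthetic


sample = [
    {"address": "A", "type": "vote_fee", "timestamp": 0, "amount": 5},
    {"address": "A", "type": "vote_fee", "timestamp": 3600, "amount": 5},
    {"address": "A", "type": "vote_fee", "timestamp": 90000, "amount": 7},
    {"address": "A", "type": "transfer", "timestamp": 10, "amount": 100},
]
kept, synthetic = compress_transfers(sample)
```

Here the three fee transfers collapse into two synthetic records (one per day), while the ordinary value transfer is left untouched for graphing.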

How Solana support impacts our customers

As new blockchain protocols are developed, each with unique characteristics, it will be necessary to tweak corresponding research and transaction models for greater efficiency and understanding. Solana support in our investigative suite demonstrates the importance of simple and accurate models to improve investigation and compliance within an otherwise complex network. Our research and development team removed the complexity of token account ownership changes by modeling and visualizing those changes as value transfers, and mapped transfers to historical owners to provide accurate investigative conclusions and reliable risk calculations. 

Many of our customers have already begun using our Solana support for enhanced blockchain analysis. Maxim Piessen, Co-founder and CTO of Solana-based decentralized credit marketplace Credix, told us about how our support for Solana will help them. “Working with Chainalysis throughout their onboarding of Solana, we were really impressed with the rigor they apply to make sure accurate, actionable findings are surfaced. Advanced transaction monitoring and risk assessment are key to the trust and transparency of our platform; therefore, the reliability of the underlying data is paramount. As we continue to innovate and expand, our vision remains – to build the future of global credit markets.”

At Chainalysis, we will continue to improve our product offerings and broaden our customer base for robust results. Please direct any questions about Solana support to [email protected].

This material is for informational purposes only, and is not intended to provide legal, tax, financial, or investment advice. Recipients should consult their own advisors before making these types of decisions. Chainalysis has no responsibility or liability for any decision made or any other acts or omissions in connection with Recipient’s use of this material.

Chainalysis does not guarantee or warrant the accuracy, completeness, timeliness, suitability or validity of the information in this report and will not be responsible for any claim attributable to errors, omissions, or other inaccuracies of any part of such material.