• Author: @laser

    Proofs Walkthrough

    For infra (Part One):

    What artifacts are built and how are they distributed and consumed?
    What artifacts are accessed at runtime?
    What does "use small sectors" actually do?
    What is the status of PoRep and PoSt?

    For everybody else (Part Two):

    What is a sector base
    What responsibilities & code are in Rust
    Major APIs and how they are used
    Identifiers in use
    Metadata that needs to be kept
    What dials we have on performance of proofs
    What if any big changes are coming down the pipe

    Part One: Stuff Infra Cares About

    Build and Distribution


    When the master branch of rust-fil-proofs builds successfully in CircleCI, a gzipped tarball is pushed to GitHub releases. We build once in CircleCI using an OSX container and once using a Linux container. A release is identified (tagged) by the first 16 characters of the build's Git SHA.
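    As a rough illustration of the tagging rule, here is a minimal Go sketch that derives a release tag from a full commit SHA. The function name ReleaseTag and the sample SHA are made up for this example; the only thing taken from the text above is the "first 16 characters" rule.

```go
package main

import (
	"errors"
	"fmt"
)

// releaseTagLen is the number of leading SHA characters used as the tag,
// per the description above.
const releaseTagLen = 16

// ReleaseTag returns the release tag for a full Git commit SHA.
// The name is hypothetical; it is not taken from the build scripts.
func ReleaseTag(sha string) (string, error) {
	if len(sha) < releaseTagLen {
		return "", errors.New("commit SHA is shorter than 16 characters")
	}
	return sha[:releaseTagLen], nil
}

func main() {
	// Made-up SHA, for illustration only.
	tag, err := ReleaseTag("0123456789abcdef0123456789abcdef01234567")
	if err != nil {
		panic(err)
	}
	fmt.Println(tag) // prints 0123456789abcdef
}
```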

    This tarball contains:

    libfilecoin_proofs.a (a "static library")
    libfilecoin_proofs.h (corresponding C-header file)
    libfilecoin_proofs.pc (a pkg-config manifest, used to specify linker dependencies)
    paramcache (populates Groth parameter-cache - typically /tmp/filecoin-proof-parameters)
    paramfetch (fetches Groth parameters from IPFS gateway into cache)
    parameters.json (contains the most recently-published Groth parameters)

    Groth Parameters

    On occasion, a rust-fil-proofs developer will want to publish new Groth parameters. They might do this because we changed some aspect of the VDF PoSt or ZigZag PoRep setup or public parameters, which obsoletes the previously published parameters.

    To do this, the developer first runs paramcache, which populates the Groth parameter-directory on their computer with parameters which reflect the state of the rust-fil-proofs proofs code. Then, the developer runs parampublish. The parampublish program facilitates the process of adding the parameters to a local IPFS node and adds the new parameters to the parameters.json file in the root of rust-fil-proofs which is tracked by Git.

    Finally, the developer messages the pinbot with the cid of the Groth parameters which have been pinned at the IPFS gateway.

    Consuming Distributed Artifacts


    go-filecoin depends on rust-fil-proofs as a submodule. At any time, the specific Git SHA of rust-fil-proofs which is consumed by go-filecoin can be determined by running the following command from the root of go-filecoin:

    git ls-tree master proofs/rust-proofs

    When the deps command is run (either in CircleCI or on a developer's computer), the script is run.

    If both of the FILECOIN_USE_PRECOMPILED_RUST_PROOFS and GITHUB_TOKEN environment variables are set, the script attempts to download an OS-appropriate release tarball from rust-fil-proofs (GitHub) releases. The tarball, as previously mentioned, is identified by the first 16 characters of the Git SHA from which it was created. go-filecoin uses the SHA of its rust-fil-proofs submodule to construct a release URL and requests assets via cURL. Once downloaded, the precompiled assets are unpacked to locations hard-coded into the various go-filecoin Go files which consume them (e.g. rustverifier.go, rustsectorbuilder.go).

    If the aforementioned environment variables are not set or no matching release can be found, rust-fil-proofs is built from source from within the submodule directory. Once built, assets are unpacked to locations hard-coded into the various go-filecoin Go files which consume them (e.g. rustverifier.go, rustsectorbuilder.go).
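    The two branches above can be sketched as a small decision function. This is only a model of the logic, not the script itself: ChooseStrategy and releaseExists are hypothetical names, "set" is assumed to mean "non-empty", and the actual download and build steps are left out.

```go
package main

import "fmt"

// Strategy names the two ways the proofs artifacts can be obtained.
type Strategy string

const (
	DownloadRelease Strategy = "download-release"
	BuildFromSource Strategy = "build-from-source"
)

// ChooseStrategy mirrors the two-branch logic described above.
// getenv is injected so the logic can run without touching the real environment.
// releaseExists is a hypothetical stand-in for "a matching release can be found".
func ChooseStrategy(getenv func(string) string, releaseExists func(tag string) bool, submoduleSHA string) Strategy {
	// Assumption: a variable counts as "set" when it is non-empty.
	if getenv("FILECOIN_USE_PRECOMPILED_RUST_PROOFS") == "" || getenv("GITHUB_TOKEN") == "" {
		return BuildFromSource
	}
	// Releases are tagged by the first 16 characters of the Git SHA.
	if len(submoduleSHA) < 16 || !releaseExists(submoduleSHA[:16]) {
		return BuildFromSource
	}
	return DownloadRelease
}

func main() {
	env := map[string]string{
		"FILECOIN_USE_PRECOMPILED_RUST_PROOFS": "1",
		"GITHUB_TOKEN":                         "made-up-token",
	}
	getenv := func(key string) string { return env[key] }
	noRelease := func(string) bool { return false }

	// With no matching release, the sketch falls back to a source build.
	fmt.Println(ChooseStrategy(getenv, noRelease, "0123456789abcdef0123456789abcdef01234567"))
}
```

    Injecting getenv and releaseExists keeps the fallback path easy to exercise in a unit test.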

    Groth Parameters

    Groth parameters can be fetched from IPFS using the paramfetch program or can be built using the paramcache program. These programs are part of rust-fil-proofs, and are available to go-filecoin developers who have downloaded a prebuilt rust-fil-proofs build. These programs can alternatively be built from source using the script which is run by the deps subcommand.

    The paramfetch program is provided with a path to the parameters.json which is part of the rust-fil-proofs submodule. Groth parameters listed in this JSON file are downloaded (via cURL) from the IPFS gateway and placed into the Groth parameter directory (typically /tmp/filecoin-proof-parameters).
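    To make the fetch step concrete, here is a hedged Go sketch that turns a parameters manifest into a list of downloads. The JSON layout (file name mapped to an object with a "cid" field), the gateway host, and the names PlanFetches and FetchJob are all assumptions for illustration; paramfetch itself is a Rust program and may work differently. The "/ipfs/<cid>" path is the usual IPFS gateway convention.

```go
package main

import (
	"encoding/json"
	"fmt"
	"path/filepath"
	"sort"
)

// paramEntry is an ASSUMED shape for one manifest entry.
type paramEntry struct {
	CID string `json:"cid"`
}

// FetchJob describes one download: where to get it and where to put it.
type FetchJob struct {
	URL  string
	Dest string
}

// PlanFetches parses the manifest and returns one job per parameter file,
// sorted by file name so the result is deterministic.
func PlanFetches(manifest []byte, gateway, cacheDir string) ([]FetchJob, error) {
	var entries map[string]paramEntry
	if err := json.Unmarshal(manifest, &entries); err != nil {
		return nil, err
	}
	names := make([]string, 0, len(entries))
	for name := range entries {
		names = append(names, name)
	}
	sort.Strings(names)

	jobs := make([]FetchJob, 0, len(names))
	for _, name := range names {
		jobs = append(jobs, FetchJob{
			URL:  gateway + "/ipfs/" + entries[name].CID,
			Dest: filepath.Join(cacheDir, name),
		})
	}
	return jobs, nil
}

func main() {
	// Made-up manifest contents and gateway host.
	manifest := []byte(`{"a.params":{"cid":"QmExampleA"},"b.params":{"cid":"QmExampleB"}}`)
	jobs, err := PlanFetches(manifest, "https://gateway.example", "/tmp/filecoin-proof-parameters")
	if err != nil {
		panic(err)
	}
	for _, job := range jobs {
		fmt.Println(job.URL, "->", job.Dest)
	}
}
```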

    What Artifacts are Accessed At Runtime?

    A short list:

    Groth parameters are accessed by rust-fil-proofs PoSt and PoRep generation and verification routines at runtime.
    params.out (loaded once, at startup - @laser to research why this exists)
    parameters.json is accessed by paramfetch at runtime.

    What is the status of PoRep and PoSt?


    ZigZag PoRep is used in the nightly and user clusters today. @dignifiedquire is working on an alternative to ZigZag ("boring") PoRep which involves something called accumulators. He and @porcuquine are meeting on Friday to make a go/no-go decision on accumulators. Lots of math involved.


    VDF PoSt is neither complete (from a math-and-science standpoint) nor integrated into go-filecoin. According to @porcuquine, "There are parts of the circuit which haven't been implemented." In short, there are "theory problems" which prevent PoSt from being completed.

    I, Sir @laser, am working to integrate our current, incomplete PoSt implementation with go-filecoin today. There are some challenges which have emerged over the last few weeks which prevent this from being a straightforward process:

    We didn't until recently have a way to sample the blockchain for the randomness required by the challenge seed-generation process.
    Our mechanism for sampling the chain only works from within the state machine.
    A VDF PoSt can be created over a fixed number of sectors (currently 2), which means that I need to modify go-filecoin to call the PoSt generator in a loop (in the case of a miner needing to prove more than 2 sectors in a proving period).
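    The last point amounts to batching sectors into groups of two and calling the generator once per group. A minimal Go sketch of that batching follows; ChunkSectors is a hypothetical helper, and how a short final group should be handled (padding, repeating a sector, or something else) is not settled by the text above, so the sketch simply returns it as-is.

```go
package main

import "fmt"

// sectorsPerPoSt is the fixed number of sectors one VDF PoSt covers today,
// per the text above.
const sectorsPerPoSt = 2

// ChunkSectors splits sector IDs into consecutive groups of at most size.
// The final group may be shorter than size; what to do with it is left open.
func ChunkSectors(sectorIDs []uint64, size int) [][]uint64 {
	if size <= 0 {
		return nil
	}
	var chunks [][]uint64
	for start := 0; start < len(sectorIDs); start += size {
		end := start + size
		if end > len(sectorIDs) {
			end = len(sectorIDs)
		}
		chunks = append(chunks, sectorIDs[start:end])
	}
	return chunks
}

func main() {
	sectors := []uint64{10, 11, 12, 13, 14}
	for i, chunk := range ChunkSectors(sectors, sectorsPerPoSt) {
		// A real caller would invoke the PoSt generator here, once per chunk.
		fmt.Printf("PoSt call %d covers sectors %v\n", i, chunk)
	}
}
```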

    What Does "Use Small Sectors" Actually Do?

    Every call through CGO from go-filecoin to rust-fil-proofs includes a configuration value. This value is either Live or Test.

    When the FIL_USE_SMALL_SECTORS environment variable is set to true, go-filecoin will pass the Test value on all CGO calls to rust-fil-proofs. Otherwise, Live will be passed.

    When rust-fil-proofs operations run with the Live configuration, sectors are configured to hold 266338304 bytes (254MiB). When Test is provided, sectors are configured to hold 1016 bytes of user data.
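    Put together, the environment variable selects a configuration and the configuration fixes the sector size. The Go sketch below models that mapping; the type and function names (ProofsMode, ModeFromEnv, SectorBytes) are invented for the example, while the variable name, the "true" value, and the two byte counts come from the text above.

```go
package main

import "fmt"

// ProofsMode models the configuration value passed on every CGO call.
type ProofsMode int

const (
	ModeLive ProofsMode = iota
	ModeTest
)

const (
	liveSectorBytes uint64 = 266338304 // 254 MiB
	testSectorBytes uint64 = 1016
)

// ModeFromEnv returns ModeTest only when FIL_USE_SMALL_SECTORS is "true".
// getenv is injected so the function can be exercised without os.Setenv.
func ModeFromEnv(getenv func(string) string) ProofsMode {
	if getenv("FIL_USE_SMALL_SECTORS") == "true" {
		return ModeTest
	}
	return ModeLive
}

// SectorBytes reports how many bytes a sector holds in this mode.
func (m ProofsMode) SectorBytes() uint64 {
	if m == ModeTest {
		return testSectorBytes
	}
	return liveSectorBytes
}

func main() {
	small := ModeFromEnv(func(string) string { return "true" })
	live := ModeFromEnv(func(string) string { return "" })
	fmt.Println(small.SectorBytes(), live.SectorBytes()) // prints 1016 266338304
}
```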

    Sealing time is the sum of replication time (scales linearly with sector size) and proof-generation time (grows logarithmically with sector size). So: Environments (and tests) which want to seal in the smallest amount of time possible should use the Test configuration, AKA "small" sectors.

    Each configuration (sector size) uses its own circuit, and each circuit has its own Groth parameters. Today, we publish both sets of parameters (in parameters.json) to IPFS.

    Part Two: WTF is a Sector Base?

    What is a Sector Base?

    Today, the "Sector Base" is a grab-bag of functions and classes (traits) scattered across 3 Rust crates in the rust-fil-proofs repository. These crates make up 19% of the 32446 lines of Rust in the rust-fil-proofs repository.

    Bundled up in libfilecoin_proofs, these crates are used by go-filecoin to:

    Write (and "preprocess") user piece-bytes to disk
    Schedule seal (proof-of-replication) jobs
    Schedule proof-of-spacetime generation jobs
    Schedule unseal (retrieval) jobs
    Provide an interface to proof-of-spacetime and proof-of-replication verification tools (called from within the FIL state machine)
    Map between replica commitment and sealed sector-bytes
    Map between piece key (base-58 or base-58BTC-encoded CID of user piece) and sealed sector-bytes

    How does go-filecoin interact with the rust-fil-proofs Sector Base?

    The go-filecoin codebase contains two Go interfaces which are used to interact with the Sector Base:

    // Verifier checks proofs; it is called from within the state machine.
    type Verifier interface {
    	VerifyPoST(VerifyPoSTRequest) (VerifyPoSTResponse, error)
    	VerifySeal(VerifySealRequest) (VerifySealResponse, error)
    }

    // SectorBuilder stages pieces, seals sectors, and generates proofs.
    type SectorBuilder interface {
    	AddPiece(ctx context.Context, pi *PieceInfo) (sectorID uint64, err error)
    	ReadPieceFromSealedSector(pieceCid cid.Cid) (io.Reader, error)
    	SealAllStagedSectors(ctx context.Context) error
    	SectorSealResults() <-chan SectorSealResult
    	GetMaxUserBytesPerStagedSector() (uint64, error)
    	GeneratePoST(GeneratePoSTRequest) (GeneratePoSTResponse, error)
    	Close() error
    }

    The Verifier interface is satisfied by the RustVerifier struct, and the SectorBuilder interface is satisfied by the RustSectorBuilder struct.

    Both the RustVerifier and RustSectorBuilder structs use CGO to call Rust code directly (in-process) in the Sector Base. CGO functions like C.CBytes are used to move bytes across CGO from Go to Rust; Go allocates on the C heap (outside the Go garbage collector) through CGO and then provides Rust with pointers from which it can reconstitute arrays, structs, and so forth.


    go-filecoin controls the lifecycle of the rust-fil-proofs SectorBuilder. When the node starts, a CGO call is made to Rust in order to construct a rust-fil-proofs SectorBuilder. Rust returns through its FFI system a pointer to the rust-fil-proofs SectorBuilder which go-filecoin holds on to. When go-filecoin is done with its RustSectorBuilder, it calls its destructor method which returns the pointer to Rust, allowing it to be freed.
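    One property this lifecycle needs is that the pointer is handed back to Rust exactly once. The Go sketch below shows one way the Go side might guard that with sync.Once. It does not make any real CGO calls: the init and destroy functions are stand-ins for the constructor and destructor calls, and builderHandle is a hypothetical name, not the actual RustSectorBuilder.

```go
package main

import (
	"fmt"
	"sync"
)

// builderHandle models the Go-side wrapper around the pointer returned by Rust.
type builderHandle struct {
	ptr     uintptr       // stand-in for the opaque pointer returned through FFI
	destroy func(uintptr) // stand-in for the CGO destructor call
	once    sync.Once
}

// newBuilderHandle "constructs" the builder and keeps the returned pointer.
func newBuilderHandle(init func() uintptr, destroy func(uintptr)) *builderHandle {
	return &builderHandle{ptr: init(), destroy: destroy}
}

// Close hands the pointer back exactly once, even if called repeatedly,
// so the Rust side never frees the same allocation twice.
func (h *builderHandle) Close() error {
	h.once.Do(func() {
		h.destroy(h.ptr)
		h.ptr = 0
	})
	return nil
}

func main() {
	destroyed := 0
	h := newBuilderHandle(
		func() uintptr { return 42 },  // pretend Rust returned this pointer
		func(uintptr) { destroyed++ }, // pretend this frees it on the Rust side
	)
	h.Close()
	h.Close() // second call is a no-op
	fmt.Println("destructor calls:", destroyed) // prints destructor calls: 1
}
```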

    Calls through CGO

    Each method on the RustSectorBuilder struct has a rust-fil-proofs SectorBuilder equivalent. When a method on the RustSectorBuilder is called, Go makes a CGO call to the Rust equivalent, passing along a pointer to the Rust-allocated rust-fil-proofs SectorBuilder.

    For example, the go-filecoin SectorBuilder interface contains a method:

    AddPiece(ctx context.Context, pi *PieceInfo) (sectorID uint64, err error)

    which RustSectorBuilder turns into a CGO call to the following function in rust-fil-proofs:

    pub unsafe extern "C" fn add_piece(
        ptr: *mut SectorBuilder,
        piece_key: *const libc::c_char,
        piece_ptr: *const u8,
        piece_len: libc::size_t,
    ) -> *mut responses::AddPieceResponse


    Identifiers in Use

    Sealed sector path
    Yet-to-be-sealed ("unsealed") path
    Piece key (base-58 encoded CID of user piece-bytes)


    The rust-fil-proofs SectorBuilder stores some mappings in its metadata:

    base-58(BTC) encoded piece CID to sealed sector
    sealed and yet-to-be-sealed sector to file path
    replica commitment to file path
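    For readers who think in Go, the three mappings can be pictured as three maps. This is only a mental model under assumed types (string keys, uint64 sector IDs); the real metadata lives in Rust and its layout is not described here. Metadata and SealedPathForPiece are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// Metadata models the three mappings listed above.
type Metadata struct {
	PieceToSector map[string]uint64 // piece key -> sealed sector ID
	SectorPaths   map[uint64]string // sector ID (sealed or staged) -> file path
	CommRToPath   map[string]string // replica commitment -> file path
}

// SealedPathForPiece chains the first two mappings: piece key to sector,
// then sector to the file that holds its bytes.
func (m *Metadata) SealedPathForPiece(pieceKey string) (string, error) {
	sectorID, ok := m.PieceToSector[pieceKey]
	if !ok {
		return "", errors.New("unknown piece key")
	}
	path, ok := m.SectorPaths[sectorID]
	if !ok {
		return "", errors.New("no path recorded for sector")
	}
	return path, nil
}

func main() {
	// Made-up keys and paths, for illustration only.
	m := &Metadata{
		PieceToSector: map[string]uint64{"QmExamplePiece": 7},
		SectorPaths:   map[uint64]string{7: "/sealed/sector-7"},
		CommRToPath:   map[string]string{"example-comm-r": "/sealed/sector-7"},
	}
	path, err := m.SealedPathForPiece("QmExamplePiece")
	if err != nil {
		panic(err)
	}
	fmt.Println(path) // prints /sealed/sector-7
}
```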

    Dials on Performance

    For now, go-filecoin has none.


    go-filecoin and rust-fil-proofs must live in same process
    go-filecoin does long polling to learn when sector has been sealed
    Can't scale rust-fil-proofs SectorBuilder to multiple machines
    No easy way for a miner to add more storage to an already in-use rust-fil-proofs SectorBuilder
    All rust-fil-proofs SectorBuilder metadata always buffered into memory
    A zillion other things

    Plans for the Future

    Our short-term plans include:

    Integrate our half-baked PoSt into go-filecoin
    Design PoRep and PoSt configuration
    Allow go-filecoin to make tradeoffs between time/space
    Support multiple sector sizes
    Brian V. involved in convos with @porcuquine
    @porcuquine has design in head but needs spec proposal
    NOTE: more sector sizes means more Groth parameters to be generated





  • X

    It has come to my attention that storage clients wish to obtain the CommD and CommR associated with the sector into which the piece referenced in their storage deal has been sealed. The client can already use the query-deal command to obtain the CID of the commitSector message corresponding to that sector - but message wait doesn't show individual commitSector arguments - it just shows some string encoding of the concatenated arguments' bytes.
    I propose to augment the ProofInfo struct with CommR and CommD such that the storage client can query for their deal and, when available, see the replica and data commitments in the query-storage-deal command. Alternatively, the query-storage-deal code could get deal state from the miner and use the CommitmentMessage CID to look up CommR and CommD (on chain) - but this seems like more work than is really necessary.
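    A rough Go sketch of what the augmented struct might look like. The existing fields shown here (SectorID, CommitmentMessage) and their types are guesses for illustration, not copied from go-filecoin; only CommD and CommR are the proposed additions, sized at 32 bytes each.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// ProofInfo is a HYPOTHETICAL version of the struct returned to the client.
type ProofInfo struct {
	SectorID          uint64 // assumed existing field
	CommitmentMessage string // assumed: CID of the commitSector message, as a string

	CommD [32]byte // proposed: data commitment
	CommR [32]byte // proposed: replica commitment
}

// Commitments returns both commitments hex-encoded, ready for display in
// the deal query output.
func (p ProofInfo) Commitments() (commD, commR string) {
	return hex.EncodeToString(p.CommD[:]), hex.EncodeToString(p.CommR[:])
}

func main() {
	var info ProofInfo
	info.SectorID = 7
	info.CommD[0] = 0xab // made-up bytes, for illustration only
	info.CommR[0] = 0xcd

	commD, commR := info.Commitments()
	fmt.Println("CommD:", commD)
	fmt.Println("CommR:", commR)
}
```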

  • X

    Hmmmm. These are just my thoughts off the cuff:

    For straight-ahead performance that's not specifically concerned with the issue of loading the data (from wherever), then work on smaller (1GiB) sectors is okay. When considering optimization for space so that large sectors can be replicated, 128GB for >1GiB sectors is obviously problematic from a normal replication perspective. However, if we consider the attacker who wants to replicate fast at any cost, then maybe it's okay.
    Based on this, we could probably focus on smaller sectors as a reasonable representation of the problem. This has the unfortunate consequence that the work is less applicable to the related problem of speeding replication even when memory must be conserved to some extent.
    I guess as a single datum to help calibrate our understanding of how R2 scales, it would be worth knowing exactly how much RAM is required for both 1GiB and (I guess) 2GiB. If the latter really fails with 128GB RAM, how much does it require not to? If the work you're already doing makes it easy to get this information, it might help us reason through this. I don't think you should spend much time or go out of your way to perform this investigation though, otherwise.
    Others may feel differently about any of this.

  • X

    If there does exist such a thing, I cannot find it.

    zenground0 [7 hours ago]
    I don't believe there is

    zenground0 [7 hours ago]
    tho maybe phritz has some "refactor the vm" issues that are relevant

    laser [7 hours ago]
    I assert to you that we must create an InitActor in order for CreateStorageMiner to conform to the specification.

    Why [7 hours ago]
    I’ll take things I don’t disagree with for $400 Alex

    zenground0 [7 hours ago]
    Agreement all around. However this is one of those changes that is pretty orthogonal to getting the storage protocol to work and something we can already do. We need to track it but I see it as a secondary priority to (for example) getting faults or arbitrating deals working.

    anorth [3 hours ago]
    Thanks @zenground0, I concur. Init actor is in our high-level backlog, but I'm not surprised there's no issue yet. Is it reasonable to draw our boundary there for now?

  • X

    Does there already exist a story which addresses the need to create an InitActor?

