Build your Engine

This page explains how to turn the Engine template into a real detection Engine.

At a high level, your Engine will:

receive a bounty (artifact + rules + deadlines)
analyze the artifact using your tooling
return an analysis response:
- verdict: malicious, benign, or unknown
- bid: how much NCT you stake (within the bounty bid rules)
- metadata: optional context (for example malware_family, confidence)

Note: Setup and installation steps live in Quickstart. This page focuses on implementation patterns and the analyzer logic.

Core implementation point: `analyze(bounty)`

In the template, the main implementation point is the analyzer function (commonly named analyze(bounty)).

Your analyzer should:

branch early on artifact type (and other attributes if needed)
retrieve the artifact (bytes, stream, or temp file path)
run your detection logic (local tool, service, or remote API)
map tool output into a stable verdict, bid, and metadata

Architectures

Depending on the scanning technology you are using, you will typically implement one of these patterns:

Command Line Scanner
Remote API
Local Service

Command Line Scanner

If your tool runs via CLI, update analyze() to:

get the artifact (often via a temp file)
run your CLI tool with a timeout
parse output to a stable verdict and metadata
return an analysis dict

Tips:

always enforce timeouts
capture exit codes and stderr
keep output parsing resilient and consistent
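
The tips above can be sketched with Python's standard `subprocess` module. This is a minimal illustration, not the template's implementation: the helper name `run_cli_scanner`, the exit-code convention (0 = clean, 1 = detection, anything else = error), and the idea that the first stdout line carries the detection name are all assumptions you would replace with your scanner's real behavior. Verdicts are shown as plain strings to keep the sketch independent of the Engine library.

```python
import subprocess
import sys


def run_cli_scanner(cmd, timeout_s=30.0):
    """Run a scanner command and map its outcome to a verdict string.

    Hypothetical helper. Assumed convention: exit code 0 = clean,
    1 = detection, any other code = scanner error.
    """
    try:
        proc = subprocess.run(
            cmd,
            capture_output=True,  # keep stdout and stderr for parsing and logs
            text=True,
            timeout=timeout_s,    # always enforce a timeout
        )
    except subprocess.TimeoutExpired:
        return {"verdict": "unknown", "reason": "timeout"}
    except OSError as exc:
        # Binary missing, not executable, and similar spawn failures
        return {"verdict": "unknown", "reason": f"spawn failed: {exc}"}

    if proc.returncode == 0:
        return {"verdict": "benign", "reason": ""}
    if proc.returncode == 1:
        # Assumption: the first stdout line holds the detection name
        lines = proc.stdout.strip().splitlines()
        family = lines[0] if lines else ""
        return {"verdict": "malicious", "reason": family[:128]}
    # Any other exit code is treated as a failure, never as a verdict
    return {"verdict": "unknown", "reason": proc.stderr.strip()[:256]}


if __name__ == "__main__":
    # Stand-in "scanner": a Python one-liner that prints a name and exits 1
    demo = [sys.executable, "-c", "print('Eicar-Test'); raise SystemExit(1)"]
    print(run_cli_scanner(demo))
```

Mapping timeouts and unexpected exit codes to an unknown verdict keeps a tool failure from turning into a staked verdict.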

Remote API

If your tool is a remote API that accepts files and/or URLs:

get the bounty (and sometimes the artifact)
submit an analysis request to the remote API
poll (or wait for an async callback) until the analysis completes
parse results
return an analysis dict

A key decision is how to pass file artifacts:

if the remote API can accept a URL to download the sample, prefer passing artifact_uri
otherwise, your Engine must download the artifact and then upload it (slower, and uses more bandwidth)

For URL artifacts, you can usually pass bounty.artifact_uri directly as the scan target.
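
The polling step can be sketched as a small loop bounded by a deadline. This is an illustration under assumptions: `poll_until_done` and its parameters are hypothetical names, and `fetch_result` stands for whatever call your remote API offers to check a submitted job (returning `None` while the job is still pending). How the deadline is derived from the bounty is not shown here.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    fetch_result: Callable[[], Optional[dict]],
    deadline_s: float,
    interval_s: float = 2.0,
) -> Optional[dict]:
    """Call fetch_result() until it returns a result or the deadline passes.

    Hypothetical helper: fetch_result wraps your remote API's status call
    and returns None while the analysis is still running.
    """
    stop_at = time.monotonic() + deadline_s
    while True:
        result = fetch_result()
        if result is not None:
            return result
        remaining = stop_at - time.monotonic()
        if remaining <= 0:
            # Out of time: the caller should respond with an unknown verdict
            return None
        # Never sleep past the deadline
        time.sleep(min(interval_s, remaining))
```

A `None` return means the remote analysis did not finish in time, which fits the unknown verdict rather than a guess.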

Local Service

If your tool runs locally as a daemon (for example ClamAV):

download the artifact (bytes or temp file)
connect to the local service (socket/HTTP)
send the artifact or path
parse results
return an analysis dict

With local services, you also need surrounding machinery to ensure the service is running before the Engine starts handling bounties.
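
One piece of that machinery is a readiness check that runs before the Engine accepts bounties. The sketch below assumes the service listens on a TCP port; `wait_for_service` is a hypothetical helper name, and a service exposed over a Unix socket or HTTP health endpoint would need a different probe.

```python
import socket
import time


def wait_for_service(host, port, timeout_s=30.0, interval_s=0.5):
    """Return True once a TCP connection to (host, port) succeeds.

    Hypothetical helper: retries until timeout_s has elapsed, then
    returns False so the caller can refuse to start handling bounties.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect is taken as "the daemon is accepting clients"
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

A successful connect only shows that the port is open. If your service offers a ping or version command, sending it is a stronger check.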

Branching on bounty attributes

Many Engines need to take different actions based on the bounty contents, for example:

Engines that process URLs, domains, and IPs differently
Engines that process both file and URL artifact types
Engines that only support a limited set of file mimetypes

The examples below assume polyswarm_engine is available as ps:

import polyswarm_engine as ps

File vs URL artifact type

Branch early so you either handle the artifact type explicitly or return a safe UNKNOWN response.

if ps.is_file_artifact(bounty):
    do_file()
else:
    do_url()

If you only support one type, reject the others:

if not ps.is_file_artifact(bounty):
    return ps.UNSUPPORTED

Detecting IP vs domain vs URL

The bounty does not always indicate whether a URL artifact is an IP, a domain, or a full URL. Use bounty.artifact_uri as input to your own parsing and branching:

if ps.is_url_artifact(bounty):
    target = bounty.artifact_uri
    if is_ip(target):
        do_ip_task()
    elif is_domain(target):
        do_domain_task()
    else:
        do_url_task()
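
The `is_ip` and `is_domain` helpers in the snippet above are not defined on this page. One possible sketch, using only the standard library, is shown below. The rules are deliberately simple assumptions: anything that parses as an IPv4 or IPv6 address is an IP, anything made of two or more valid hostname labels with no scheme or path is a domain, and everything else is treated as a full URL.

```python
import ipaddress
import re

# One hostname label: letters, digits, hyphens; no leading or trailing hyphen
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def is_ip(target: str) -> bool:
    """True if target is a bare IPv4 or IPv6 address (brackets allowed)."""
    try:
        ipaddress.ip_address(target.strip("[]"))
        return True
    except ValueError:
        return False


def is_domain(target: str) -> bool:
    """True if target looks like a bare domain name, with no scheme or path."""
    if "://" in target or "/" in target or is_ip(target):
        return False
    labels = target.rstrip(".").split(".")
    # Require at least two labels so single words are not treated as domains
    return len(labels) >= 2 and all(_LABEL.match(label) for label in labels)
```

Inputs such as `example.com:8080` fail both checks and fall through to the URL branch, which is usually the safer default.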

Detecting mimetypes

If your Engine only supports specific mimetypes, check before scanning:

SUPPORTED_MIMETYPES = ["mimetype1", "mimetype2", "mimetype3"]

if ps.is_file_artifact(bounty) and bounty.mimetype in SUPPORTED_MIMETYPES:
    do_file()
else:
    return ps.UNSUPPORTED

Minimal analyzer example

Start by branching on artifact type and returning ps.UNSUPPORTED (an UNKNOWN response) for anything you do not support.

import polyswarm_engine as ps

# `engine` is assumed to be the Engine object already created in the template
@engine.register_analyzer
def analyze(bounty: ps.Bounty) -> ps.Analysis:
    # Only handle file artifacts in this example
    if not ps.is_file_artifact(bounty):
        return ps.UNSUPPORTED

    # Download the artifact to a temporary file for scanners that expect a file path
    with ps.ArtifactTempfile(bounty) as path:
        result = my_scanner(path)  # Replace with your tooling

    # Map your scanner output into PolySwarm analysis fields
    if result.malicious:
        return {
            "verdict": ps.MALICIOUS,
            "bid": ps.bid_max(bounty),
            "metadata": {
                "malware_family": result.family,
                "confidence": float(result.confidence),
            },
        }

    return {
        "verdict": ps.BENIGN,
        "bid": ps.bid_min(bounty),
        "metadata": {"confidence": float(result.confidence)},
    }

Returning a valid analysis

Your return payload must validate against the Engine schema rules.

Verdict rules

Use verdicts consistently:

MALICIOUS: strong detection
BENIGN: strong evidence it is clean
UNKNOWN: unsupported type, failed processing, timeouts, or low confidence
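
If your tool produces a numeric score, the rules above can be sketched as a threshold mapping. The function name `to_verdict` and the thresholds 0.9 and 0.1 are illustrative assumptions, not recommended values, and verdicts are shown as plain strings rather than the library constants.

```python
from typing import Optional


def to_verdict(
    score: Optional[float],
    malicious_at: float = 0.9,
    benign_at: float = 0.1,
) -> str:
    """Map a maliciousness score in [0.0, 1.0] to a verdict string.

    Hypothetical helper: thresholds are placeholders to tune against
    your own tool. score=None stands for a failed or timed-out scan.
    """
    if score is None:
        return "unknown"
    if score >= malicious_at:
        return "malicious"
    if score <= benign_at:
        return "benign"
    # The middle band is low confidence, so do not commit to a verdict
    return "unknown"
```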

Bid rules

If you return MALICIOUS or BENIGN, your bid must be within the bounty range: MIN_BID <= bid <= MAX_BID.
- MIN_BID and MAX_BID are provided in the bounty rules for each bounty.
If you return UNKNOWN (including UNSUPPORTED), your bid must be 0.

Use helpers like ps.bid_min(bounty) and ps.bid_max(bounty) to stay inside the range.
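
When you compute a bid yourself instead of using the helpers, the two rules can be sketched as a clamp. In this illustration `choose_bid` is a hypothetical function, and `min_bid` and `max_bid` are plain integers standing in for the values from the bounty rules. How you read them from the bounty object is not shown here.

```python
def choose_bid(verdict: str, desired: int, min_bid: int, max_bid: int) -> int:
    """Return a bid that satisfies the bid rules for the given verdict.

    Hypothetical helper: min_bid and max_bid stand in for the MIN_BID and
    MAX_BID values supplied with each bounty.
    """
    if min_bid > max_bid:
        raise ValueError("min_bid must not exceed max_bid")
    if verdict == "unknown":
        # UNKNOWN (including UNSUPPORTED) must always bid 0
        return 0
    # MALICIOUS or BENIGN: clamp into MIN_BID <= bid <= MAX_BID
    return max(min_bid, min(desired, max_bid))
```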

Metadata discipline

Keep metadata consistent and meaningful:

include malware_family when you have a stable family label
only include confidence if you have a real signal behind it (float 0.0 to 1.0)
avoid dumping raw tool output or unbounded strings
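
One way to enforce these points is to build metadata through a single helper. The sketch below is an assumption-laden example: `build_metadata` and the 64-character cap on `malware_family` are hypothetical choices, not limits defined by the Engine schema.

```python
import math

MAX_FAMILY_LEN = 64  # assumed cap, not a documented schema limit


def build_metadata(family=None, confidence=None):
    """Build a small, bounded metadata dict from scanner output.

    Hypothetical helper: omits fields that have no real signal behind them.
    """
    metadata = {}
    if family:
        # Collapse whitespace and bound the length so raw tool output cannot leak in
        cleaned = " ".join(str(family).split())[:MAX_FAMILY_LEN]
        if cleaned:
            metadata["malware_family"] = cleaned
    if confidence is not None:
        value = float(confidence)
        if not math.isnan(value):
            # Keep confidence inside the 0.0 to 1.0 range
            metadata["confidence"] = min(1.0, max(0.0, value))
    return metadata
```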

Bidding

Bidding expresses confidence with stake.

start conservative while validating reliability and accuracy
only increase bids where your signal is stable and repeatable
always follow the bid rules for the verdict you return

See Bidding Strategy for bid sizing guidance.
