Use Case · Batch ETL + AI Inference
The customer's batch ETL pipeline produced records that needed to be classified by an ONNX model before landing. The architecture was the standard one: NiFi handled extraction, transform, and routing; a separate model-serving layer ran inference; results were rejoined downstream. Two clusters, two control planes, two on-call rotations, one logical job.
The end-to-end runtime on the reference workload was 690 seconds. The team had already spent a quarter tuning batch sizes, model concurrency, and serialization between the two stacks.
The problem wasn't the model. It wasn't the ETL. It was the boundary between them. Every record had to leave the data plane, cross the wire to the inference service, and come back — for every batch, every run.
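As a rough illustration of that boundary, the sketch below contrasts a per-batch round trip with an inline call. Everything in it is hypothetical: `toy_classify` stands in for the ONNX model, `RemoteInferenceStub` imitates a separate serving layer using JSON serialization, and none of the names come from NiFi, PySpark, or Falcon. It shows only the shape of the overhead, not measured costs.

```python
import json


def toy_classify(record):
    # Hypothetical stand-in for the ONNX model: a simple threshold rule.
    return "high" if record["value"] >= 50 else "low"


class RemoteInferenceStub:
    """Imitates a separate model-serving layer reached over the wire."""

    def __init__(self):
        self.round_trips = 0
        self.bytes_moved = 0

    def classify_batch(self, batch):
        request = json.dumps(batch)        # serialize: data plane -> wire
        self.bytes_moved += len(request)
        records = json.loads(request)      # deserialize on the serving side
        response = json.dumps([toy_classify(r) for r in records])
        self.bytes_moved += len(response)
        self.round_trips += 1
        return json.loads(response)        # deserialize back in the data plane


def run_two_stack(batches, service):
    # Every batch leaves the data plane and comes back.
    labels = []
    for batch in batches:
        labels.extend(service.classify_batch(batch))
    return labels


def run_inline(batches):
    # The classifier runs in the same process; nothing is serialized.
    return [toy_classify(r) for batch in batches for r in batch]


batches = [[{"value": v} for v in range(start, start + 4)] for start in (0, 48, 96)]
service = RemoteInferenceStub()
two_stack_labels = run_two_stack(batches, service)
inline_labels = run_inline(batches)
print(two_stack_labels == inline_labels, service.round_trips, service.bytes_moved)
```

Both paths produce the same labels; the two-stack path additionally pays one serialize, transfer, and deserialize cycle per batch, which is the cost that grows with every batch and every run.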
Falcon was scoped against the customer's reference job in a Test Flight, using the same input data, the same ONNX model, and the same expected outputs. It was measured against both the existing NiFi pipeline and a PySpark equivalent the team had built for comparison.
Falcon ran the reference job in 12 seconds. The same job took 168 seconds on PySpark and 690 seconds on the existing NiFi pipeline. The numbers are not a tuning win; they are the difference between an interpreted, network-bound, multi-cluster topology and a single compiled binary that loads the model once and runs it inline.
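For a sense of scale, the ratios implied by those three runtimes can be worked out directly. The short sketch below uses only the figures quoted above; the dictionary labels are illustrative.

```python
# Runtimes on the reference workload, in seconds, as quoted in the text.
runtimes_s = {"NiFi + model serving": 690, "PySpark": 168, "Falcon": 12}

baseline = runtimes_s["Falcon"]
# How many times longer each stack takes than Falcon on the same job.
speedups = {name: t / baseline for name, t in runtimes_s.items()}

for name, t in runtimes_s.items():
    print(f"{name}: {t} s, {speedups[name]:.1f}x Falcon's runtime")
```

That works out to 57.5x against the existing NiFi pipeline and 14x against the PySpark equivalent.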
The architectural collapse mattered as much as the runtime. The Falcon deployment removed the model-serving cluster, the orchestration glue between stacks, and the per-batch network round trip. The same job that required two systems and a coordination layer was now a single binary the team could ship to a forward node.
"Reduces processing time and compute needs and is the best we have seen. Very fitting for the Army's edge compute needs."
A separate technical reviewer on the engagement put it more bluntly: "When we first met, what you told me sounded too good to be true. From the tests we've run, you've kept your word."
Three architectural properties drove the result. None of them are achievable by tuning the existing stack.
This is why the comparison reads as a step change. NiFi and PySpark were optimized as far as they could be; the boundary between data and inference was the floor. Falcon removes the boundary.
Test Flight
We'll run your workload as a single Falcon pipeline and benchmark it against your existing topology. Free. 2 to 4 weeks. Apples-to-apples results on your model and your data.