Turn 1,000 Cloud Alerts
Into 1 Strategic Action
Kassandra is an enterprise-grade cloud-native application protection platform (CNAPP) that maps complex, multi-account cloud infrastructures into real-time directed graphs. By running sub-millisecond path analysis and identifying tactical choke points, it isolates critical exposure and reduces alert fatigue.
Core Ingestion & Posture Engine
Central coordinator for async telemetry collection and threat reasoning
Continuous Cloud Discovery & Event Streaming
Hybrid AWS API scanning with real-time CloudTrail event monitoring
AWS Auditing Plugins
Extensible AWS auditing plugins for security data collection
Runtime Intelligence & eBPF Sensor
Kernel-level eBPF telemetry for runtime process monitoring
Enterprise Graph Intelligence Core
Rust-accelerated attack path discovery with multi-layer validation
4-Layer Reachability Validation
4-layer attack path validation across Network, IAM, Data, and Controls
Policy Intelligence Engine
Local LLM-powered S3 policy analysis with zero data egress
Data Posture Management (DSPM)
Content-aware data security with Go-based S3 scanning
Interconnected Campaign Analysis
Consolidates alerts into MITRE-mapped attack scenarios
Threat Detection & Exposure Analysis
100+ severity mappings with toxic combination detection
Autonomous Red Teaming & Attack Path Validation
Deep dry-run attack simulation with real AWS API evidence
Business Context Engine
Translates technical scores into financial exposure metrics
Enterprise Auto-Remediation Engine
Dual-strategy automated remediation with SDK and Terraform
Solving Enterprise Multi-Account Cloud Exposure
Alert Fatigue
Analysts triage thousands of "Critical" alerts on non-sensitive assets daily.
→ BCE shifts focus to financial blast radius ($).
Lateral Movement Blindness
Attackers chain low-severity exposures; scanners miss multi-hop paths.
→ Yen's K-Shortest Path & Neo4j graph maps every route.
State Drift & Collisions
Manual changes during remediation break infrastructure stability.
→ LIFO rollback + Tarjan SCC auto-resolves deadlocks.
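The two ideas named in the last item can be sketched in a few lines of Python. This is a minimal illustration, not Kassandra's implementation: the `RollbackStack` class, the step names, and the "waits-on" graph shape (a dict mapping each step to the steps it waits for) are all assumptions made for the example.

```python
class RollbackStack:
    """Undo actions are replayed in reverse (LIFO) order of registration."""

    def __init__(self):
        self._undo = []

    def record(self, undo_fn):
        self._undo.append(undo_fn)

    def rollback(self):
        while self._undo:
            self._undo.pop()()  # most recent change is reverted first


def tarjan_scc(graph):
    """Return the strongly connected components of a directed graph."""
    index, low = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = 0

    def visit(v):
        nonlocal counter
        index[v] = low[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            components.append(component)

    for node in list(graph):
        if node not in index:
            visit(node)
    return components


def find_deadlocks(waits_on):
    """A component with more than one member is a circular wait."""
    return [sorted(c) for c in tarjan_scc(waits_on) if len(c) > 1]


# Hypothetical remediation steps: three of them wait on each other.
waits_on = {
    "rotate_key": ["update_policy"],
    "update_policy": ["detach_role"],
    "detach_role": ["rotate_key"],
    "block_ip": ["rotate_key"],
}
deadlocks = find_deadlocks(waits_on)

log = []
rb = RollbackStack()
rb.record(lambda: log.append("undo step 1"))
rb.record(lambda: log.append("undo step 2"))
rb.rollback()
```

The recursive form is fine for small graphs; a very deep dependency chain would need an iterative variant to stay under Python's recursion limit.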
Data Ingest & Analysis Pipelines
Raw Ingestion, PII Auditing & Events Capture
Captures real-time syscall events (execve, connect) via Cilium Tetragon eBPF probes alongside CloudTrail. Performs immediate data sanitization, TCKN/MIME/CC-PAN detection, and masking at the gateway before queueing.
Transport & Buffering (SQS / DLQ)
Streams already-sanitized, anonymized high-velocity telemetry through Redis Streams (kassandra:discovery:stream) and buffers events in AWS SQS with Dead-Letter Queue (DLQ) backpressure fallback.
Normalization & Columnar Parsing
Transforms unstructured, masked JSON logs into a unified resource model. Uses a Rust-accelerated Apache Arrow parser for zero-copy memory layout and fast Go preprocessing.
Cognitive Decision Engine
Computes threat paths using Yen's K-Shortest Paths algorithm (NetworkX/Neo4j). Leverages Exponential Moving Average (EMA) and Z-score triggers to score path risk.
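One way an EMA baseline and a Z-score trigger can work together is sketched below. The smoothing factor `alpha`, the threshold of 3.0, and the class name `EmaZScore` are assumptions chosen for illustration; the source does not state the actual parameters.

```python
import math


class EmaZScore:
    """Tracks an exponentially weighted mean and variance of a risk signal."""

    def __init__(self, alpha=0.2, threshold=3.0):
        self.alpha = alpha
        self.threshold = threshold
        self.mean = None
        self.var = 0.0

    def update(self, x):
        """Score x against the current baseline, then fold it into the baseline."""
        if self.mean is None:  # first sample only seeds the baseline
            self.mean = float(x)
            return 0.0, False
        std = math.sqrt(self.var)
        z = (x - self.mean) / std if std > 0 else 0.0
        diff = x - self.mean
        # Standard incremental update for an exponentially weighted mean/variance.
        self.mean += self.alpha * diff
        self.var = (1 - self.alpha) * (self.var + self.alpha * diff * diff)
        return z, abs(z) >= self.threshold


detector = EmaZScore()
baseline_flags = [detector.update(x)[1] for x in [10, 11, 10, 11, 10, 11]]
spike_z, spike_flag = detector.update(50)
```

Scoring each sample before updating the baseline keeps a spike from diluting its own Z-score.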
Governance & Feedback Loop
Calculates the Choke Point Importance Score (CPIS) by blending weighted betweenness centrality and blast radius size. Weighs financial exposure against SLA policies.
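The exact CPIS formula is not given here, so the following is only a sketch of what "blending" the two signals could look like: a weighted sum of normalized betweenness and normalized blast radius. The 0.6 weight, the normalization, the precomputed betweenness values, and the toy graph are all assumptions.

```python
from collections import deque


def blast_radius(graph, node):
    """Number of other nodes reachable from `node` (breadth-first search)."""
    seen = {node}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for neighbor in graph.get(current, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) - 1


def cpis(graph, betweenness, node, weight=0.6):
    """Hypothetical blend: weight * betweenness + (1 - weight) * blast radius."""
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    max_between = max(betweenness.values()) or 1.0
    max_radius = max(len(nodes) - 1, 1)
    b_norm = betweenness.get(node, 0.0) / max_between
    r_norm = blast_radius(graph, node) / max_radius
    return weight * b_norm + (1 - weight) * r_norm


# Toy attack graph: internet gateway -> instance -> role -> two data stores.
graph = {"igw": ["ec2"], "ec2": ["role"], "role": ["s3", "rds"]}
# Betweenness assumed to be precomputed elsewhere (e.g. by a graph library).
betweenness = {"igw": 0.0, "ec2": 3.0, "role": 4.0, "s3": 0.0, "rds": 0.0}
scores = {n: cpis(graph, betweenness, n) for n in ("igw", "ec2", "role")}
```

In this toy graph the role node ranks highest because every path to the data stores passes through it.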
Autonomous Remediation Pipeline
Deploys targeted, transactional remediations (AWS key rotation, S3 PutBucketPolicy, eBPF XDP network blocks) via Terraform HCL and custom SDK scripts.
Upcoming Capabilities
Agentic AI
Autonomous agent swarm (Brain & Arns) executing strategy graphs, attack path simulations, and zero-trust policy orchestration.
AWS Platform Technical Roadmap
Planned cyber-economic autonomous defense platform extension detailing native AWS integration phases.
Multi-Cloud Security Posture Management
Unified cross-cloud compliance, threat telemetry, and security graphs for Azure and GCP, alongside advanced AWS telemetry enhancements.
Performance Comparison: Lab vs Production
Total System Execution
System Benchmark Status
Business Context Engine
Broad classification and metadata enrichment across legacy environments.
Test Results
- P95 52.05 ms, average 32.47 ms, 50 samples
- Regex engine: Python `re` module, cached patterns
Functional Scope
Detects PII (Personally Identifiable Information) in log files and database outputs. Uses regular expression pattern matching and Merkle tree-based integrity verification. Includes custom masks for AWS Account IDs (12-digit numbers), IPv4 addresses, and IAM ARN formats. Tested on a synthetic log containing 100,000 lines.
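A minimal sketch of the masking and integrity steps described above, using only Python's `re` and `hashlib`. The regular expressions, the placeholder tokens, the SHA-256 choice, and the "duplicate the last node on odd levels" rule are assumptions for illustration and may differ from the real patterns and tree layout.

```python
import hashlib
import re

# Patterns are compiled once at import time, i.e. cached.
IAM_ARN_RE = re.compile(r"arn:aws:iam::\d{12}:[A-Za-z0-9_+=,.@\-/]+")
OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
IPV4_RE = re.compile(r"\b(?:" + OCTET + r"\.){3}" + OCTET + r"\b")
ACCOUNT_ID_RE = re.compile(r"(?<!\d)\d{12}(?!\d)")


def mask_line(line):
    """Mask ARNs first so the embedded account ID is not masked twice."""
    line = IAM_ARN_RE.sub("<IAM_ARN>", line)
    line = IPV4_RE.sub("<IPV4>", line)
    return ACCOUNT_ID_RE.sub("<AWS_ACCOUNT_ID>", line)


def merkle_root(lines):
    """SHA-256 Merkle root over log lines; an odd level repeats its last node."""
    if not lines:
        return hashlib.sha256(b"").hexdigest()
    level = [hashlib.sha256(l.encode("utf-8", errors="replace")).digest()
             for l in lines]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()


raw = "AssumeRole arn:aws:iam::123456789012:role/Admin from 10.0.0.1 acct 123456789012"
masked = mask_line(raw)
root_before = merkle_root(["a", "b", "c"])
root_after = merkle_root(["a", "b", "x"])
```

Comparing the two roots shows the point of the tree: changing any single line changes the root.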
Production Risk & Scale Factors
Actual CloudTrail logs can range in size from 1 to 10 GB. Regex matching is limited by disk I/O. Expect a P95 of 150–300 ms. Additionally, encoding issues (non-UTF-8 characters) in actual logs can cause regex errors. Merkle tree calculation requires parallel processing.
DSPM
Data Security Posture Management. Sensitivity mapping core.
Test Results
- P95 52.65 ms, average 32.91 ms
- In-memory processing, no disk I/O.
Functional Scope
MIME type detection, magic number parsing, followed by content-based classification. Header parsing for CSV, stream parsing for PDF (OCR disabled, regex-based). 50 files of 10 MB each, totaling 500 MB. PII detection: TCKN (11 digits), credit card PAN (verified using the Luhn algorithm).
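The PAN check mentioned above relies on the Luhn checksum, which is sketched below. The TCKN function goes one step further than the 11-digit length check described here by also applying the published TCKN checksum rules; treat that extra validation as an assumption about how a scanner might reduce false positives, not as a description of the product.

```python
def luhn_valid(pan):
    """Luhn check: double every second digit from the right, sum, mod 10."""
    digits = [int(c) for c in pan if c.isdigit()]
    if len(digits) < 12:
        return False
    total = 0
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def tckn_valid(value):
    """TCKN: 11 digits, non-zero first digit, two trailing check digits."""
    if len(value) != 11 or not value.isdigit() or value[0] == "0":
        return False
    d = [int(c) for c in value]
    odd_sum = d[0] + d[2] + d[4] + d[6] + d[8]
    even_sum = d[1] + d[3] + d[5] + d[7]
    tenth = (odd_sum * 7 - even_sum) % 10
    eleventh = sum(d[:10]) % 10
    return d[9] == tenth and d[10] == eleventh


card_ok = luhn_valid("4111 1111 1111 1111")   # well-known test PAN
card_bad = luhn_valid("4111 1111 1111 1112")
tckn_ok = tckn_valid("10000000146")           # commonly used test value
tckn_bad = tckn_valid("12345678901")
```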
Production Risk & Scale Factors
Downloading a 10 MB file from S3 takes ~200–500 ms (depending on the region). PDF parsing can actually take 1–2 seconds using `PyPDF2` or `pdfplumber`. If OCR is enabled, it takes 5–10 seconds. Total time for a 500 MB file set is 30–60 seconds. P95 is 200–400 ms. Memory usage: 50 parallel files = 500 MB RAM; if the actual container limit is 1–2 GB, swap usage begins.
eBPF Monitor
Kernel-level observability for traffic flow and system calls.
Test Results
- P95 437.60 ms, average 107.94 ms
- 10 batches, each with 100 events
Functional Scope
Kernel-level syscall interception. Intercepts system calls such as `execve`, `connect`, and `openat` at a throughput of 1,000 events per second. Events are written to Neo4j in batches of 100 using a Cypher `UNWIND` query. The node risk score is updated in real time.
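The batching step can be sketched without a database connection. The Cypher text, the `Process` label, and the event field names below are hypothetical; only the batch size of 100 and the use of `UNWIND` come from the description above.

```python
# Hypothetical upsert: one round trip writes a whole batch of events.
UPSERT_QUERY = """
UNWIND $events AS e
MERGE (p:Process {pid: e.pid})
SET p.last_syscall = e.syscall, p.risk_score = e.risk
"""


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_batches(events, batch_size=100):
    """Pair the query with one parameter dict per batch."""
    return [(UPSERT_QUERY, {"events": batch})
            for batch in chunked(events, batch_size)]


# 1,000 synthetic events become 10 batches of 100.
events = [{"pid": i, "syscall": "execve", "risk": 0.1} for i in range(1000)]
batches = build_batches(events)
```

With the official Neo4j Python driver, each pair would typically be passed to `session.run(query, parameters)`; that call is left out so the sketch runs standalone.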
Production Risk & Scale Factors
In production the eBPF program is loaded into the kernel and its events are transferred to userspace via `perf_buffer`. This transfer takes 0.1–1 ms. The main latency is in the Neo4j `UNWIND` batch write. A cross-AZ Neo4j cluster adds 5–10 ms of network latency, so a batch of 100 events takes 10–20 ms and a P95 of 80–150 ms is reasonable. However, a sustained throughput of 1,000 events per second can exhaust the Neo4j connection pool (30 sessions). Once the pool is full, requests queue up and the P95 can rise to 300–500 ms.
SQS Pipeline
Async message handling for high-throughput logs.
Test Results
- SQS + DB latency was simulated using `asyncio.sleep`.
Functional Scope
A data integration pipeline that retrieves 2,000 messages from AWS SQS using a cost-aware queuing method, groups them into 20 batches of 100 messages each, and writes them to PostgreSQL through a pool of 50 standby database connections.
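The batch-and-pool shape described above can be simulated in the same way the test did, with `asyncio.sleep` standing in for SQS and PostgreSQL latency. The function names and the 1 ms delay are assumptions; the semaphore plays the role of the 50-connection pool.

```python
import asyncio


async def write_batch(batch, pool, written):
    """Simulated PostgreSQL write; the semaphore caps concurrent 'connections'."""
    async with pool:
        await asyncio.sleep(0.001)  # stand-in for real database latency
        written.extend(batch)


async def run_pipeline(messages, batch_size=100, pool_size=50):
    pool = asyncio.Semaphore(pool_size)
    batches = [messages[i:i + batch_size]
               for i in range(0, len(messages), batch_size)]
    written = []
    await asyncio.gather(*(write_batch(b, pool, written) for b in batches))
    return len(batches), written


# 2,000 synthetic messages -> 20 batches of 100.
batch_count, written = asyncio.run(run_pipeline(list(range(2000))))
```

With 20 batches and 50 slots, no batch ever waits for a connection; the cap only matters once the batch count exceeds the pool size.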
Production Risk & Scale Factors
SQS `ReceiveMessage` waits up to 20 seconds on an empty queue (long polling) and returns immediately if a message is present. With `MaxNumberOfMessages=10`, a chunk of 100 messages requires 10 calls. Each call takes 50–100 ms and a PostgreSQL write takes 10–20 ms, for a total of 60–120 ms per chunk. This total assumes the 10 receive calls run concurrently; issued one after another they would take 500–1,000 ms. P95: 200–300 ms. If the SQS visibility timeout (30 seconds) expires before a message is deleted, the message is delivered again, resulting in duplicate processing. If the PostgreSQL connection pool (50) is full, a `QueuePool` error occurs and the write must be retried.
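Redelivery after a visibility timeout is usually handled by making the consumer idempotent. The sketch below deduplicates on the `MessageId` field that SQS attaches to each message; the `IdempotentConsumer` class and the in-memory set are assumptions, and a real deployment would need a shared, persistent store instead.

```python
class IdempotentConsumer:
    """Skips messages whose MessageId has already been handled."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()  # in-memory only; illustrative

    def process(self, message):
        message_id = message["MessageId"]
        if message_id in self._seen:
            return False  # duplicate delivery, ignored
        self._handler(message)
        self._seen.add(message_id)  # mark done only after success
        return True


handled = []
consumer = IdempotentConsumer(lambda m: handled.append(m["Body"]))
results = [
    consumer.process({"MessageId": "m-1", "Body": "event A"}),
    consumer.process({"MessageId": "m-2", "Body": "event B"}),
    consumer.process({"MessageId": "m-1", "Body": "event A"}),  # redelivered
]
```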
RustCore
High-performance logic gate for critical system checks.
Test Results
- 99 ms deterministic execution time for in-memory graph queries with 5 throttle retries. Delay injection is the dominant cost.
- 100 calls, 5 throttle retries.
Functional Scope
Computes shortest alternative routes on a 1,000-node graph. The results are returned to the Python program as a single compressed payload, which avoids per-item overhead when data crosses the language boundary.
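For readers unfamiliar with how "alternative routes" are enumerated, here is a compact pure-Python sketch of Yen's K-shortest paths algorithm on a small weighted graph. It is not the Rust implementation; the graph format (dict of dicts with integer weights) and function names are assumptions. NetworkX users would more likely call `networkx.shortest_simple_paths`.

```python
import heapq


def dijkstra(graph, src, dst, banned_edges=frozenset(), banned_nodes=frozenset()):
    """Shortest path as (cost, path), or None if dst is unreachable."""
    dist, prev, done = {src: 0}, {}, set()
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v, w in graph.get(u, {}).items():
            if v in done or v in banned_nodes or (u, v) in banned_edges:
                continue
            if d + w < dist.get(v, float("inf")):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return dist[dst], path[::-1]


def path_cost(graph, path):
    return sum(graph[a][b] for a, b in zip(path, path[1:]))


def yen_k_shortest(graph, src, dst, k):
    """Up to k loop-free paths from src to dst, cheapest first."""
    first = dijkstra(graph, src, dst)
    if first is None:
        return []
    accepted, candidates = [first], []
    while len(accepted) < k:
        last = accepted[-1][1]
        for i in range(len(last) - 1):
            root = last[:i + 1]
            # Ban the next edge of every accepted path sharing this root.
            banned_edges = {(p[i], p[i + 1]) for _, p in accepted
                            if len(p) > i + 1 and p[:i + 1] == root}
            spur = dijkstra(graph, root[-1], dst, banned_edges, set(root[:-1]))
            if spur is None:
                continue
            full = root[:-1] + spur[1]
            cand = (path_cost(graph, full), full)
            if cand not in candidates and cand not in accepted:
                heapq.heappush(candidates, cand)
        if not candidates:
            break
        accepted.append(heapq.heappop(candidates))
    return accepted


graph = {
    "C": {"D": 3, "E": 2}, "D": {"F": 4}, "E": {"D": 1, "F": 2, "G": 3},
    "F": {"G": 2, "H": 1}, "G": {"H": 2}, "H": {},
}
paths = yen_k_shortest(graph, "C", "H", 3)
```

Ties between equal-cost candidates are broken by comparing the node lists, which is an artifact of this sketch rather than part of the algorithm.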
Production Risk & Scale Factors
500 ms: fixed-rate polling and serialization boundaries.
APVE Logic
Advanced Policy Verification. Heavy computation layer.
Test Results
- 1108 ms stress-load execution latency with multiple simulated paths.
Functional Scope
A rate-limiting and load-management layer that protects AWS authentication calls with a limit of 5 requests per second and a 5-second timeout. Delayed transactions are routed to a background queue with dynamically shared resources so that they do not degrade foreground performance.
Production Risk & Scale Factors
5000 ms+: concurrency lock delays and AWS API rate limits. Background queues are used so that the first 5 paths are delivered instantly. A real AWS IAM `SimulatePrincipalPolicy` call takes 200–500 ms, so 10 requests take 2–5 seconds. The token bucket allows 5 requests per second and runs out in 2 seconds, and the 6th path waits 1 second. With the 5-second timeout, the 5th path completes at the last moment and the 6th path times out. In practice the timeout rate is 30–50%. The Borrow+Steal background queue verifies the remaining paths later: the first 5 paths are displayed to the user immediately and the rest are processed in the background. UX message: "5 paths verified, 5 paths pending".
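The 5-per-second limit and the "verified now, pending later" split can be illustrated with a small token bucket. The class, the injectable clock, and the `split_paths` helper are assumptions made for the example; the Borrow+Steal queue itself is not modelled.

```python
import time


class TokenBucket:
    """Refills at `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate, capacity, now=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._now = now  # injectable clock keeps the example deterministic
        self._tokens = float(capacity)
        self._last = now()

    def try_acquire(self):
        current = self._now()
        elapsed = current - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = current
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


def split_paths(paths, bucket):
    """Paths that get a token are verified now; the rest go to the background."""
    verified, pending = [], []
    for path in paths:
        (verified if bucket.try_acquire() else pending).append(path)
    return verified, pending


clock = [0.0]  # fake clock, advanced by hand
bucket = TokenBucket(rate=5, capacity=5, now=lambda: clock[0])
verified, pending = split_paths([f"path-{i}" for i in range(1, 11)], bucket)
message = f"{len(verified)} paths verified, {len(pending)} paths pending"

clock[0] = 0.25  # a quarter second later, 1.25 tokens have been refilled
next_allowed = bucket.try_acquire()
then_denied = bucket.try_acquire()
```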
Known Technical Debt
GIL Acquisition Delay
Actual: 5–15 ms; target: 1 ms. `rmp-serde` batching is required.
PostgreSQL DSPM write
50 parallel inserts will cause lock contention if the `dspm_findings` table is not indexed. A `COPY` or batch insert is required.
Neo4j Connection Pool
30 sessions are insufficient for a sustained throughput of 100–200 events per second. Either the connection pool must be increased or an asynchronous driver must be used.
