Named Entity Recognition for Security Intelligence


A security-focused NER pipeline using Fenic's semantic extraction capabilities to identify and analyze threats, vulnerabilities, and indicators of compromise from unstructured security reports.

Overview

This pipeline demonstrates automated security entity extraction and risk assessment:

Zero-shot entity extraction (CVEs, IPs, domains, hashes)
Enhanced extraction with threat intelligence context
Document chunking for comprehensive analysis
Risk prioritization and actionable intelligence
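Because zero-shot extraction relies on a language model, it can occasionally return malformed values. The following is a minimal sketch, not part of Fenic or of ner.py, of how extracted CVEs, IPs, and hashes might be sanity-checked after extraction using only the Python standard library. The helper names (`is_cve`, `is_ip`, `hash_type`) are hypothetical.

```python
import ipaddress
import re

# CVE identifiers follow the pattern CVE-<year>-<4 or more digits>.
CVE_RE = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)
HEX_RE = re.compile(r"[0-9a-fA-F]+")

# Common hash digests distinguished by hex length (an illustrative subset).
HASH_LENGTHS = {32: "md5", 40: "sha1", 64: "sha256"}


def is_cve(value: str) -> bool:
    """Return True if the value looks like a well-formed CVE identifier."""
    return CVE_RE.fullmatch(value.strip()) is not None


def is_ip(value: str) -> bool:
    """Return True if the value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value.strip())
        return True
    except ValueError:
        return False


def hash_type(value: str):
    """Return 'md5', 'sha1', or 'sha256' based on hex length, else None."""
    candidate = value.strip()
    if HEX_RE.fullmatch(candidate) is None:
        return None
    return HASH_LENGTHS.get(len(candidate))
```

A check like this only confirms that a value is well formed; it does not confirm that the value actually appears in the report or that it is malicious.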

Prerequisites

Install Fenic:

pip install fenic

Configure your OpenAI API key:

export OPENAI_API_KEY="your-api-key-here"

Usage

python ner.py

Implementation

The pipeline processes security reports through five stages:

Basic NER: Extract standard security entities
Enhanced NER: Add threat-specific context
Chunking: Handle long documents effectively
Analytics: Aggregate and analyze extracted entities
Risk Assessment: Generate actionable intelligence
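To illustrate the Chunking and Analytics stages, here is a rough standard-library sketch of overlapping word-based chunking followed by aggregation of per-chunk results. It does not use Fenic's own chunking or aggregation functions, and the function names, the default sizes, and the shape of the per-chunk results (a dict mapping entity type to a list of strings) are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def chunk_words(text, chunk_size=200, overlap_pct=0.1):
    """Split text into chunks of chunk_size words that overlap by overlap_pct."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap_pct < 1:
        raise ValueError("overlap_pct must be in [0, 1)")

    words = text.split()
    overlap = int(chunk_size * overlap_pct)
    step = chunk_size - overlap  # always >= 1 because overlap_pct < 1

    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current chunk has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


def aggregate_entities(per_chunk_results):
    """Count how often each entity value appears across all chunks.

    per_chunk_results is assumed to be a list of dicts such as
    {"cves": ["CVE-2021-44228"], "ips": ["10.0.0.5"]}.
    """
    totals = defaultdict(Counter)
    for result in per_chunk_results:
        for entity_type, values in result.items():
            for value in values:
                totals[entity_type][value.strip()] += 1
    return dict(totals)
```

Overlap means an entity near a chunk boundary can be extracted twice, so raw counts from overlapping chunks are best read as a rough signal of prominence rather than an exact number of mentions.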

Troubleshooting

Issue: Incomplete entity extraction
Solution: Increase the chunk size or adjust the overlap percentage so each chunk carries more context.

Issue: Missing threat actors or APT groups
Solution: Add more specific descriptions in the Pydantic field definitions.

Issue: Generic risk assessments
Solution: Include more context about your organization in the assessment prompt.
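For the threat actor issue above, more specific field descriptions might look something like the sketch below. The model name, field names, and description wording are hypothetical and are not taken from ner.py; the sketch only relies on Pydantic's `BaseModel` and `Field(description=...)`, and assumes the extraction schema accepts list-of-string fields.

```python
from typing import List

from pydantic import BaseModel, Field


class ThreatEntities(BaseModel):
    """Hypothetical extraction schema with detailed field descriptions."""

    threat_actors: List[str] = Field(
        default_factory=list,
        description=(
            "Named threat actors or APT groups, including aliases and "
            "vendor-specific names (for example 'APT29' or 'Cozy Bear'). "
            "Exclude generic terms such as 'attackers' or 'hackers'."
        ),
    )
    malware_families: List[str] = Field(
        default_factory=list,
        description=(
            "Named malware families or tools referenced in the report, "
            "not generic categories such as 'ransomware'."
        ),
    )
```

Descriptions that give concrete examples and state what to exclude tend to give the model a clearer target than a bare field name.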