Named Entity Recognition for Security Intelligence
A security-focused NER pipeline using Fenic's semantic extraction capabilities to identify and analyze threats, vulnerabilities, and indicators of compromise from unstructured security reports.
Overview
This pipeline demonstrates automated security entity extraction and risk assessment:
- Zero-shot entity extraction (CVEs, IPs, domains, hashes)
- Enhanced extraction with threat intelligence context
- Document chunking for comprehensive analysis
- Risk prioritization and actionable intelligence
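As a sanity check on the entity types listed above, a plain regex baseline can be compared against what the LLM extraction returns. This is a minimal sketch, not part of the Fenic pipeline: the patterns and the `baseline_entities` name are assumptions for illustration, and domains are left out because they are hard to match reliably with a short pattern.

```python
import re

# Illustrative patterns only (assumptions); real indicators have more variants.
PATTERNS = {
    "cve": re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE),
    "ipv4": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
    ),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
    "md5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
}


def baseline_entities(text):
    """Return the sorted unique matches for each entity type."""
    return {name: sorted(set(p.findall(text))) for name, p in PATTERNS.items()}


if __name__ == "__main__":
    sample = "Exploit of CVE-2021-44228 seen from 192.168.1.10"
    print(baseline_entities(sample))
```

Entities that the baseline finds but the semantic extraction misses are a quick signal that prompts, schemas, or chunking need attention.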
Prerequisites
- Install Fenic:
    pip install fenic
- Configure OpenAI API key:
    export OPENAI_API_KEY="your-api-key-here"
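A missing key usually surfaces as an authentication error partway through a run, so it can help to check for it up front. The helper below is a hypothetical sketch (the name `require_api_key` is an assumption, not something shipped with Fenic or ner.py):

```python
import os


def require_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the key, or raise a clear error if it is unset or blank."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the pipeline.")
    return key
```

Calling `require_api_key()` at the top of a script fails fast with a readable message instead of failing on the first model call.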
Usage
python ner.py
Implementation
The pipeline processes security reports through five stages:
- Basic NER: Extract standard security entities
- Enhanced NER: Add threat-specific context
- Chunking: Handle long documents effectively
- Analytics: Aggregate and analyze extracted entities
- Risk Assessment: Generate actionable intelligence
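The analytics stage can be pictured as merging per-chunk extraction results into mention counts. The sketch below uses only the standard library and assumes each chunk yields a dict mapping an entity type to a list of strings; that shape and the `aggregate_entities` name are illustrative assumptions, not the structure ner.py necessarily uses.

```python
from collections import Counter, defaultdict


def aggregate_entities(chunk_results):
    """Merge per-chunk entity lists into (value, count) pairs per entity type."""
    counts = defaultdict(Counter)
    for result in chunk_results:
        for entity_type, values in result.items():
            # Strip whitespace so the same entity from two chunks merges.
            counts[entity_type].update(v.strip() for v in values)
    # most_common() orders each type's entities by mention count, highest first.
    return {etype: counter.most_common() for etype, counter in counts.items()}


if __name__ == "__main__":
    chunks = [
        {"cve": ["CVE-2021-44228"], "ip": ["10.0.0.1"]},
        {"cve": ["CVE-2021-44228", "CVE-2023-0001"]},
    ]
    print(aggregate_entities(chunks))
```

Entities mentioned across many chunks are natural candidates to feed into the risk assessment stage first.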
Troubleshooting
Issue: Incomplete entity extraction
Solution: Increase chunk size or adjust overlap percentage for better context
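To see how chunk size and overlap interact, here is a minimal word-based chunker. It is a stand-in for illustration only, not Fenic's own chunking function; `chunk_words` and its parameters are assumed names.

```python
def chunk_words(text, chunk_size=200, overlap_pct=10):
    """Split text into word chunks; each chunk repeats the tail of the previous one."""
    if chunk_size <= 0 or not 0 <= overlap_pct < 100:
        raise ValueError("chunk_size must be positive and overlap_pct in [0, 100)")
    words = text.split()
    overlap = chunk_size * overlap_pct // 100
    step = chunk_size - overlap  # at least 1 because overlap_pct < 100
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks
```

A larger overlap makes it less likely that an indicator and its surrounding context are split across two chunks, at the cost of more chunks and more model calls.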
Issue: Missing threat actors or APT groups
Solution: Add more specific descriptions in the Pydantic field definitions
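A more specific field description might look like the following. The model and field names are hypothetical, not the schema in ner.py; the point is that the description names concrete examples and says what to exclude.

```python
from typing import List

from pydantic import BaseModel, Field


class ThreatEntities(BaseModel):
    """Hypothetical extraction schema; field names are illustrative."""

    threat_actors: List[str] = Field(
        default_factory=list,
        description=(
            "Named threat actors or APT groups, including aliases, "
            "e.g. 'APT29', 'Cozy Bear', 'Lazarus Group'. "
            "Exclude generic terms such as 'attackers' or 'hackers'."
        ),
    )
    malware_families: List[str] = Field(
        default_factory=list,
        description="Named malware families, e.g. 'Emotet', 'TrickBot'.",
    )
```

Descriptions that include examples and exclusions tend to give the model a clearer target than a bare field name.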
Issue: Generic risk assessments
Solution: Include more context about your organization in the assessment prompt
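One way to add organizational context is to build the assessment prompt from a small dictionary of facts. This is a sketch under assumptions: `build_assessment_prompt`, the context keys, and the prompt wording are all illustrative, and the actual prompt in ner.py may be structured differently.

```python
def build_assessment_prompt(entities_summary, org_context):
    """Combine organization facts and extracted entities into one prompt string."""
    lines = ["You are assessing risk for the following organization:"]
    lines += [f"- {key}: {value}" for key, value in org_context.items()]
    lines += [
        "",
        "Extracted entities:",
        entities_summary,
        "",
        "Rank the findings by risk to this organization and recommend specific actions.",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        build_assessment_prompt(
            "CVE-2021-44228 (mentioned in 3 chunks)",
            {"industry": "healthcare", "exposed services": "public Java web apps"},
        )
    )
```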