Enterprise Analysis
ArchiCore supports enterprise-scale projects with 50,000+ files through intelligent sampling, tiered analysis, and incremental indexing.
Overview
Large codebases like GitLab, Kubernetes, or enterprise monorepos require special handling. ArchiCore provides:
- Tiered Analysis - Choose between quick overview and deep analysis
- Smart Sampling - AI-powered file selection based on importance
- Incremental Indexing - Only re-index changed files
- Focus Directories - Prioritize critical parts of your codebase
Analysis Tiers
| Tier | Max Files | Use Case |
|---|---|---|
| Quick | 1,000 | Fast exploration, architecture overview |
| Standard | 5,000 | Day-to-day development, balanced analysis |
| Deep | 50,000 | Comprehensive analysis, security audits (Enterprise plan required) |
Tier Capabilities
Quick Tier:
├── Structure analysis ✅
├── Dependency graph ✅
├── Basic metrics ✅
├── Semantic search ❌
├── Security scanning ❌
└── Duplication detection ❌
Standard Tier:
├── Structure analysis ✅
├── Dependency graph ✅
├── Full metrics ✅
├── Semantic search ✅
├── Security scanning ✅
└── Duplication detection ✅
Deep Tier (Enterprise):
├── Full codebase analysis ✅
├── Complete graph ✅
├── Comprehensive metrics ✅
├── Semantic search ✅
├── Security scanning ✅
└── Duplication detection ✅
Sampling Strategies
When analyzing large projects, ArchiCore uses sampling to select the most important files.
Smart (Recommended)
AI-powered selection based on:
- File importance (entry points, configs, core modules)
- Import frequency (most imported files)
- Git activity (frequently changed files)
- Directory depth (shallower = more important)
archicore
> /enterprise standard smart
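The four signals listed above could be combined into a single score per file. The following is a minimal sketch under stated assumptions, not ArchiCore's actual algorithm: the `FileSignals` type, the weights, and the normalization are all hypothetical and chosen only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class FileSignals:
    """Hypothetical per-file inputs; ArchiCore's real signals are not documented here."""
    path: str
    is_key_file: bool   # entry point, config, or core module
    import_count: int   # number of other files that import this one
    commit_count: int   # number of commits that touched this file


def smart_score(f: FileSignals, max_imports: int, max_commits: int) -> float:
    """Weighted sum of the four signals; the weights are illustrative assumptions."""
    importance = 1.0 if f.is_key_file else 0.0
    imports = f.import_count / max_imports if max_imports else 0.0
    activity = f.commit_count / max_commits if max_commits else 0.0
    shallowness = 1.0 / (1 + f.path.count("/"))  # shallower paths score higher
    return 0.4 * importance + 0.3 * imports + 0.2 * activity + 0.1 * shallowness


def select_top(files: list[FileSignals], limit: int) -> list[str]:
    """Return the paths of the `limit` highest-scoring files."""
    max_imports = max((f.import_count for f in files), default=0)
    max_commits = max((f.commit_count for f in files), default=0)
    ranked = sorted(
        files,
        key=lambda f: smart_score(f, max_imports, max_commits),
        reverse=True,
    )
    return [f.path for f in ranked[:limit]]
```

Normalizing import and commit counts against the project maximum keeps each signal in the 0 to 1 range, so no single signal dominates purely because of its scale.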
Hot Files
Selects files with the most git commits in the last year. Best for understanding active development areas.
archicore
> /enterprise standard hot-files
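A comparable ranking can be approximated locally with plain git. This sketch assumes commit counts are taken from `git log --name-only` over the last year; how ArchiCore itself computes them is not documented here, and the function names are hypothetical.

```python
import subprocess
from collections import Counter


def count_commits(git_log_output: str) -> Counter:
    """Count commits per path from `git log --name-only --pretty=format:` output.

    With an empty pretty format, git prints only the touched paths, one per
    line, with blank lines between commits. Each path appears once per commit.
    """
    return Counter(
        line.strip() for line in git_log_output.splitlines() if line.strip()
    )


def hot_files(repo_dir: str, limit: int) -> list[str]:
    """Return the `limit` paths with the most commits in the last year."""
    out = subprocess.run(
        ["git", "log", "--since=1 year ago", "--name-only", "--pretty=format:"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return [path for path, _ in count_commits(out).most_common(limit)]
```

Note that renamed files are counted under each name separately in this sketch.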
Directory Balanced
Equal representation from each top-level directory. Best for exploring unfamiliar codebases.
archicore
> /enterprise quick directory-balanced
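One way to get equal representation is to take files round-robin from each top-level directory until the limit is reached. This is a sketch of that idea only; the grouping rule (root-level files grouped under `.`) and the alphabetical ordering are assumptions, not documented ArchiCore behavior.

```python
from collections import defaultdict
from itertools import zip_longest


def directory_balanced(paths: list[str], limit: int) -> list[str]:
    """Pick up to `limit` paths, cycling through top-level directories."""
    if limit <= 0:
        return []
    groups: dict[str, list[str]] = defaultdict(list)
    for p in sorted(paths):
        # Files at the repository root are grouped under "." (an assumption).
        top = p.split("/", 1)[0] if "/" in p else "."
        groups[top].append(p)

    selected: list[str] = []
    # Each iteration takes at most one file from every directory.
    for round_ in zip_longest(*groups.values()):
        for p in round_:
            if p is None:  # this directory has run out of files
                continue
            selected.append(p)
            if len(selected) == limit:
                return selected
    return selected
```

Small directories are exhausted early, after which the remaining slots go to the larger ones.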
Random
Random sampling. Useful for statistical analysis.
archicore
> /enterprise quick random
CLI Usage
Estimate Project Size
Before running an analysis, estimate the project size:
archicore
> /estimate
Output:
Project Size Estimation
━━━━━━━━━━━━━━━━━━━━━━
Total Files 92,135
Total Size 927 MB
Recommended Tier QUICK
Estimated Time ~8 minutes
Language Distribution:
.rb 46,386 (50.3%)
.js 10,075 (10.9%)
.vue 4,333 (4.7%)
Available Tiers:
quick - 1,000 files - Fast overview
standard - 5,000 files - Balanced analysis
deep - 50,000 files - Full analysis (Enterprise)
Run Enterprise Analysis
# Quick analysis with smart sampling
archicore
> /enterprise quick
# Standard analysis with hot-files strategy
> /enterprise standard hot-files
# Deep analysis (Enterprise plan required)
> /enterprise deep smart
Web Dashboard
- Open your project in the dashboard
- Click the Enterprise button
- Review project size and recommendations
- Select tier and sampling strategy
- Click Start Analysis
The modal shows:
- Total files and size
- Language distribution
- Tier options with file counts
- Sampling strategy selector
API Usage
Get Project Estimate
GET /api/projects/:id/enterprise/estimate
Response:
{
"totalFiles": 92135,
"totalSizeMB": 927.5,
"recommendation": "quick",
"estimatedTimeMinutes": 8,
"languageDistribution": {
".rb": 46386,
".js": 10075,
".vue": 4333
},
"tiers": {
"quick": {
"maxFiles": 1000,
"estimatedFiles": 1000,
"description": "Fast overview - structure and basic metrics only"
},
"standard": {
"maxFiles": 5000,
"estimatedFiles": 5000,
"description": "Standard analysis - all features with sampling"
},
"deep": {
"maxFiles": 50000,
"estimatedFiles": 50000,
"description": "Deep analysis - comprehensive analysis"
}
}
}
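The estimate response makes it easy to see how much of the project each tier would cover. A minimal sketch, assuming the response has already been parsed into a dict with the fields shown above; `tier_coverage` is a hypothetical helper, not part of any SDK.

```python
def tier_coverage(estimate: dict) -> dict[str, float]:
    """Return the percentage of project files each tier would analyze."""
    total = estimate["totalFiles"]
    if total <= 0:
        return {name: 0.0 for name in estimate["tiers"]}
    return {
        name: round(100 * tier["estimatedFiles"] / total, 1)
        for name, tier in estimate["tiers"].items()
    }


# Values taken from the example response above.
example = {
    "totalFiles": 92135,
    "tiers": {
        "quick": {"maxFiles": 1000, "estimatedFiles": 1000},
        "standard": {"maxFiles": 5000, "estimatedFiles": 5000},
        "deep": {"maxFiles": 50000, "estimatedFiles": 50000},
    },
}
print(tier_coverage(example))
# {'quick': 1.1, 'standard': 5.4, 'deep': 54.3}
```

For this project even the deep tier covers only about 54% of the files, which is why the choice of sampling strategy matters.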
Start Enterprise Indexing
POST /api/projects/:id/enterprise/index
Content-Type: application/json
{
"tier": "standard",
"sampling": {
"enabled": true,
"strategy": "smart"
},
"focusDirectories": ["src/core", "src/api"]
}
Response:
{
"taskId": "task_abc123",
"status": "pending",
"message": "Enterprise indexing task queued"
}
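The same request can be assembled with Python's standard library. This sketch only builds the request and validates its inputs; the base URL is a placeholder, authentication is omitted because it is not covered in this section, and accepting the CLI strategy names (`hot-files`, `directory-balanced`, `random`) in the API is an assumption.

```python
import json
from urllib.request import Request

TIERS = {"quick", "standard", "deep"}
# Assumption: the API accepts the same strategy names as the CLI.
STRATEGIES = {"smart", "hot-files", "directory-balanced", "random"}


def build_index_request(base_url, project_id, tier, strategy="smart",
                        focus_directories=None) -> Request:
    """Build (but do not send) the POST request that starts enterprise indexing."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy!r}")

    body = {"tier": tier, "sampling": {"enabled": True, "strategy": strategy}}
    if focus_directories:
        body["focusDirectories"] = list(focus_directories)

    return Request(
        f"{base_url.rstrip('/')}/api/projects/{project_id}/enterprise/index",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it, once the appropriate credentials have been added to the headers.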
Preview Files to Analyze
GET /api/projects/:id/enterprise/files?tier=standard&strategy=smart
Response:
{
"tier": "standard",
"strategy": "smart",
"totalSelected": 5000,
"maxAllowed": 5000,
"files": [
"src/index.ts",
"src/server/index.ts",
"package.json"
],
"hasMore": true
}
Incremental Indexing
Re-index only files changed since a specific date or commit:
POST /api/projects/:id/enterprise/incremental
Content-Type: application/json
{
"since": "2024-01-01"
}
Or using a git commit hash:
{
"since": "abc123f"
}
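Because `since` accepts two kinds of values, a client may want to check the value before sending it. The sketch below is a client-side check only; how the server tells a date from a commit hash is not documented here, and the accepted formats (`YYYY-MM-DD` dates, 7 to 40 character hexadecimal hashes) are assumptions based on the two examples above.

```python
import re
from datetime import date


def classify_since(value: str) -> str:
    """Return "date" or "commit" for a `since` value, or raise ValueError."""
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        date.fromisoformat(value)  # raises ValueError for e.g. 2024-13-40
        return "date"
    if re.fullmatch(r"[0-9a-f]{7,40}", value):
        return "commit"
    raise ValueError(f"not a YYYY-MM-DD date or git commit hash: {value!r}")
```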
SDK Usage
JavaScript/TypeScript
import { ArchiCore } from '@archicore/sdk';
const client = new ArchiCore({ apiKey: 'your-api-key' });
// Get estimate
const estimate = await client.projects.enterpriseEstimate('project-id');
console.log(`Recommended: ${estimate.recommendation}`);
console.log(`Total files: ${estimate.totalFiles}`);
// Start enterprise analysis
const task = await client.projects.enterpriseIndex('project-id', {
tier: 'standard',
sampling: { strategy: 'smart' }
});
// Preview files
const preview = await client.projects.enterpriseFilesPreview('project-id', {
tier: 'standard',
strategy: 'smart'
});
// Check incremental changes
const changes = await client.projects.enterpriseIncremental('project-id', {
since: '2024-01-01'
});
Python
from archicore import ArchiCore
client = ArchiCore(api_key="your-api-key")
# Get estimate
estimate = client.projects.enterprise_estimate("project-id")
print(f"Recommended: {estimate['recommendation']}")
print(f"Total files: {estimate['totalFiles']}")
# Start enterprise analysis
task = client.projects.enterprise_index(
"project-id",
tier="standard",
sampling_strategy="smart"
)
# With focus directories
task = client.projects.enterprise_index(
"project-id",
tier="deep",
focus_directories=["src/core", "src/api"],
exclude_patterns=["test", "spec"]
)
# Incremental indexing
client.projects.enterprise_index(
"project-id",
tier="standard",
incremental_since="2024-01-01"
)
Best Practices
For New Projects
- Run /estimate to understand project size
- Start with quick tier for initial exploration
- Use standard for regular development
- Reserve deep for critical audits
For CI/CD Integration
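# .gitlab-ci.yml
archicore-analysis:
  script:
    - archicore enterprise standard smart
  rules:
    # Run only from a pipeline schedule on main. GitLab defines the cron
    # expression (e.g. "0 2 * * 1", weekly on Monday 2AM) in the project's
    # pipeline schedule settings, not in this file.
    - if: '$CI_PIPELINE_SOURCE == "schedule" && $CI_COMMIT_BRANCH == "main"'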
For Incremental Updates
# Index only files changed since last deployment
archicore
> /enterprise standard --since=$LAST_DEPLOY_SHA
Pricing
| Plan | Quick | Standard | Deep |
|---|---|---|---|
| Free | ✅ | ❌ | ❌ |
| Pro | ✅ | ✅ | ❌ |
| Team | ✅ | ✅ | ❌ |
| Enterprise | ✅ | ✅ | ✅ |
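In a script, the plan table and the tier limits can be encoded as a simple lookup so that an unavailable tier is rejected before a request is made. This is an illustrative sketch built from the two tables in this document; the helper names are hypothetical.

```python
# Maximum files per tier, from the Analysis Tiers table.
TIER_MAX_FILES = {"quick": 1_000, "standard": 5_000, "deep": 50_000}

# Tiers available on each plan, from the Pricing table.
PLAN_TIERS = {
    "free": ["quick"],
    "pro": ["quick", "standard"],
    "team": ["quick", "standard"],
    "enterprise": ["quick", "standard", "deep"],
}


def can_run(plan: str, tier: str) -> bool:
    """Return True if `tier` is available on `plan` (case-insensitive)."""
    return tier in PLAN_TIERS.get(plan.lower(), [])


def max_files(plan: str) -> int:
    """Return the largest file limit reachable on `plan`."""
    return max(TIER_MAX_FILES[t] for t in PLAN_TIERS[plan.lower()])
```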
FAQ
How does smart sampling work?
Smart sampling uses multiple heuristics:
- Entry points (index.ts, main.py, app.js) get highest priority
- Config files (package.json, tsconfig.json) are always included
- Frequently imported files are prioritized
- Recently changed files (git history) score higher
- Test files are deprioritized but not excluded
Can I analyze the entire codebase?
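Yes, with the Deep tier (Enterprise subscription), up to its 50,000-file limit. For 90K+ file projects, this may take 10-15 minutes, and the analysis covers the sampled 50,000 files rather than every file.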
What about memory usage?
ArchiCore uses streaming processing and limits memory to ~2GB by default. For very large projects, the server may need more RAM.
Can I focus on specific directories?
Yes. Use the focusDirectories option:
await client.projects.enterpriseIndex('project-id', {
tier: 'standard',
focusDirectories: ['src/critical', 'src/api']
});
Files in focus directories get priority during sampling.
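Conceptually, that priority can be pictured as moving focus-directory files to the front of the candidate list while leaving the relative order of everything else unchanged. The sketch below illustrates this idea only; it is an assumption about the effect, not a description of ArchiCore's internals.

```python
from pathlib import PurePosixPath


def in_focus(path: str, focus_dirs: list[str]) -> bool:
    """True if `path` lies inside any focus directory (compared by path parts)."""
    p = PurePosixPath(path)
    return any(p.is_relative_to(d) for d in focus_dirs)


def prioritize(paths: list[str], focus_dirs: list[str]) -> list[str]:
    """Stable sort: focus-directory files first, original order otherwise kept."""
    return sorted(paths, key=lambda p: not in_focus(p, focus_dirs))
```

Comparing by path parts rather than by string prefix means `src/apiary/x.ts` is not mistaken for a file under `src/api`.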