cookbook 'automate-pkg-cleaner', '~> 0.1.0'
Removes old Habitat packages after Chef Automate upgrades
cookbook 'automate-pkg-cleaner', '~> 0.1.0', :supermarket
knife supermarket install automate-pkg-cleaner
knife supermarket download automate-pkg-cleaner
Automate Cleanup Cookbook
Intelligently removes old and unused Habitat packages from Chef Automate servers using dynamic dependency analysis and a two-phase cleanup approach.
What it does
After upgrading Chef Automate, there are usually leftover Habitat packages that aren't needed anymore. This cookbook automatically identifies and removes them using intelligent dependency analysis and Habitat's native uninstall mechanism.
The cookbook uses a two-phase cleanup approach:
Phase 1: Specific Package Cleanup
- Targets known packages that accumulate multiple versions over time
- Uses Chef's habitat_package resource with keep_latest '1' to safely retain the newest version
- Processes a predefined list of packages (customizable via attributes)
Phase 2: Intelligent Unused Package Detection
- Analyzes running services to identify what packages are actually in use
- Uses transitive dependency analysis with hab pkg dependencies --transitive
- Identifies truly unused packages that no running service depends on
- Respects whitelist protection for critical packages
- Implements retry logic for dependency conflicts during removal
Dynamic Package Detection
The cookbook automatically:
1. Scans running services using hab svc status
2. Builds dependency tree using Habitat's transitive dependency analysis
3. Identifies unused packages by comparing installed packages vs required packages
4. Protects critical packages using configurable whitelist patterns
5. Safely removes unused packages with intelligent retry logic for dependency conflicts
No manual package lists or CSV files required!
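The comparison in steps 2 and 3 amounts to a set difference. Below is a minimal Ruby sketch of that idea, assuming the running services and their transitive dependencies have already been collected (for example from hab svc status and hab pkg dependencies --transitive). The method name unused_packages and the sample package identifiers are hypothetical and are not taken from the cookbook's source.

```ruby
require 'set'

# Hypothetical helper: returns the installed packages that no running
# service needs, either directly or through a transitive dependency.
def unused_packages(installed, service_deps)
  required = Set.new
  service_deps.each do |service, deps|
    required << service   # the service package itself is in use
    required.merge(deps)  # and so is everything it depends on
  end
  installed.reject { |ident| required.include?(ident) }
end

# Made-up identifiers for illustration only.
installed = [
  'chef/authn-service/1.0.0/20230101000000',
  'chef/authn-service/0.9.0/20220101000000',
  'core/glibc/2.34/20220311081742',
]
service_deps = {
  'chef/authn-service/1.0.0/20230101000000' => ['core/glibc/2.34/20220311081742'],
}

puts unused_packages(installed, service_deps)
```

Only the older authn-service release is reported here, because the newer release is running and core/glibc is one of its dependencies.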
Retry Logic for Dependencies
The cookbook handles dependency conflicts intelligently:
- Initial Attempt: Tries to uninstall all target packages
- Dependency Detection: Identifies packages that can't be removed due to dependencies
- Retry Loop: Waits and retries removal of failed packages (dependencies may resolve as other packages are removed)
- Multiple Reports: Generates separate reports for each retry attempt
- Graceful Completion: Continues successfully even if some packages remain (provides guidance for re-running)
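The retry behaviour described above can be sketched in plain Ruby. This is an illustration under stated assumptions, not the cookbook's implementation: remove_with_retries and its sleeper parameter are hypothetical names, and the block stands in for whatever actually performs the uninstall. The defaults mirror the max_retries and retry_delay_seconds attributes.

```ruby
# Hypothetical retry loop. The block must return true when a package was
# removed and false when removal failed (for example, a dependency conflict).
def remove_with_retries(packages, max_retries: 5, retry_delay_seconds: 30,
                        sleeper: ->(seconds) { sleep(seconds) }, &remover)
  remaining = packages.dup
  attempts = []
  # Attempt 0 is the initial pass; attempts 1..max_retries are retries.
  (0..max_retries).each do |attempt|
    break if remaining.empty?
    sleeper.call(retry_delay_seconds) if attempt > 0
    failed = remaining.reject { |pkg| remover.call(pkg) }
    attempts << { attempt: attempt, removed: remaining - failed, failed: failed }
    remaining = failed
  end
  { attempts: attempts, remaining: remaining }
end

# Simulated system: core/a depends on core/b, so core/b cannot be removed
# while core/a is still installed.
installed = ['core/a', 'core/b']
blocked_by = { 'core/b' => 'core/a' }

result = remove_with_retries(['core/b', 'core/a'], sleeper: ->(_seconds) {}) do |pkg|
  next false if installed.include?(blocked_by[pkg])
  installed.delete(pkg)
  true
end

puts result[:attempts].length
puts result[:remaining].inspect
```

In the simulation, core/b fails on the initial pass and succeeds on the first retry once core/a is gone, which is the situation the Retry Loop bullet describes. Each entry in attempts corresponds to what a per-attempt report could contain.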
Requirements
- Chef Infra Client 16.0+
- Chef Automate Server Build 4.13.290 or higher
- Habitat installed on the system
- Root access (removing Habitat packages requires sudo)
- Works on Ubuntu, CentOS, RHEL, Amazon Linux
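The minimum Automate build listed above is a dotted version, so it should be compared segment by segment rather than as a string. A small sketch using Ruby's bundled Gem::Version follows; the method name automate_version_supported? is hypothetical, and how the cookbook itself performs this check is not shown in this README.

```ruby
require 'rubygems'

# Hypothetical version gate. Gem::Version compares each dotted segment
# numerically, so '4.13.1000' is correctly treated as newer than '4.13.290'.
def automate_version_supported?(current, required = '4.13.290')
  Gem::Version.new(current) >= Gem::Version.new(required)
end

puts automate_version_supported?('4.13.290')   # true
puts automate_version_supported?('4.12.144')   # false
puts automate_version_supported?('4.13.1000')  # true
```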
Configuration
Key attributes you can customize:
# Core settings
node['automate-pkg-cleaner']['backup_dir'] = '/var/log/automate-pkg-cleaner' # Backup location
node['automate-pkg-cleaner']['hab_binary'] = '/usr/bin/hab' # Habitat binary path
node['automate-pkg-cleaner']['hab_pkgs_path'] = '/hab/pkgs' # Habitat packages directory
node['automate-pkg-cleaner']['report_path'] = '/var/log/automate-pkg-cleaner/cleanup_report.json' # Report file

# Version and retry settings
node['automate-pkg-cleaner']['required_automate_version'] = '4.13.290' # Minimum Chef Automate version
node['automate-pkg-cleaner']['max_retries'] = 5 # Maximum retry attempts
node['automate-pkg-cleaner']['retry_delay_seconds'] = 30 # Delay between retry attempts

# Package whitelist - protects critical packages from removal
node['automate-pkg-cleaner']['package_whitelist'] = [
  'core/hab', # Protects ALL versions of core/hab
  'core/hab-sup', # Protects ALL versions of core/hab-sup
  'core/hab-launcher' # Protects ALL versions of core/hab-launcher
]

# Phase 1 cleanup targets - packages to clean with keep_latest
node['automate-pkg-cleaner']['packages_to_cleanup'] = [
  'chef/authn-service',
  'chef/automate-cli',
  'chef/automate-cs-bookshelf',
  'chef/automate-cs-nginx',
  'chef/automate-cs-oc-bifrost',
  'chef/automate-cs-oc-erchef',
  'chef/automate-cs-ocid',
  'chef/deployment-service'
]
Package Whitelist Protection
The cookbook supports flexible whitelist patterns to protect critical packages:
Pattern Matching (Recommended)
# Protects ALL versions of specified packages
node['automate-pkg-cleaner']['package_whitelist'] = [
  'core/hab', # All versions of hab
  'core/hab-sup', # All versions of hab-sup
  'chef/my-service' # All versions of my-service
]
Exact Version Protection
# Protects only specific versions
node['automate-pkg-cleaner']['package_whitelist'] = [
  'core/hab/1.6.652/20220211034027', # Only this exact version
  'chef/special-service/1.0.0/20230101000000' # Only this exact version
]
Mixed Protection
# Combine pattern and exact matching
node['automate-pkg-cleaner']['package_whitelist'] = [
  'core/hab', # All versions (pattern)
  'core/hab-sup', # All versions (pattern)
  'chef/special/1.0.0/20230101000000' # Specific version (exact)
]
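One way to read the pattern and exact rules above is as a prefix match on whole path segments. The sketch below is an assumption about the matching semantics, not the cookbook's own code, and whitelisted? is a hypothetical name. Appending a slash before the prefix check keeps 'core/hab' from accidentally protecting 'core/hab-sup'.

```ruby
# Hypothetical matcher. An origin/name entry protects every version of that
# package; a full origin/name/version/release entry protects one release.
def whitelisted?(ident, whitelist)
  whitelist.any? do |entry|
    ident == entry || ident.start_with?("#{entry}/")
  end
end

whitelist = [
  'core/hab',                          # pattern entry
  'chef/special/1.0.0/20230101000000', # exact entry
]

puts whitelisted?('core/hab/1.6.652/20220211034027', whitelist)      # true
puts whitelisted?('core/hab-sup/1.6.652/20220211034027', whitelist)  # false
puts whitelisted?('chef/special/1.0.0/20230101000000', whitelist)    # true
puts whitelisted?('chef/special/1.0.1/20230201000000', whitelist)    # false
```

A side effect of this sketch is that a three-segment entry (origin/name/version) would protect every release of that version; whether the cookbook behaves the same way is not stated in this README.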
Customizing Phase 1 Cleanup
Modify the packages targeted for old version cleanup:
# Add or remove packages as needed
node['automate-pkg-cleaner']['packages_to_cleanup'] = [
  'chef/authn-service',
  'chef/automate-cli',
  'chef/your-custom-package', # Add your packages
  # Remove any you don't want cleaned
]
How It Works
Phase 1: Targeted Package Cleanup
- Processes predefined package list using habitat_package resource
- Keeps latest version of each package using keep_latest '1'
- Safely removes old versions without breaking dependencies
- Logs each package operation for transparency
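To make the effect of keeping only the latest version concrete, the following plain-Ruby sketch selects which installed releases would be dropped for each package. It illustrates the outcome only and does not show how the habitat_package resource implements keep_latest. The helper name old_versions and the sample identifiers are hypothetical, and it assumes versions that Gem::Version can parse (others would raise ArgumentError).

```ruby
require 'rubygems'

# Hypothetical helper: given fully qualified idents
# (origin/name/version/release), returns every ident except the newest
# one for each origin/name pair.
def old_versions(idents)
  idents.group_by { |ident| ident.split('/').first(2).join('/') }
        .flat_map do |_name, group|
          sorted = group.sort_by do |ident|
            _origin, _pkg, version, release = ident.split('/')
            # Order by version first, then by release timestamp.
            [Gem::Version.new(version), release]
          end
          sorted[0...-1] # everything but the newest
        end
end

# Made-up identifiers for illustration only.
idents = [
  'chef/automate-cli/4.13.290/20240101000000',
  'chef/automate-cli/4.12.144/20231201000000',
  'chef/automate-cli/4.13.290/20240102000000',
  'chef/deployment-service/4.13.290/20240101000000',
]

puts old_versions(idents)
```

A package with a single installed release, such as chef/deployment-service in the sample, contributes nothing to the removal list.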
Phase 2: Dynamic Unused Package Detection
- Analyzes running services using hab svc status
- Maps dependencies using hab pkg dependencies --transitive for each running service
- Builds required package list from all running services and their dependencies
- Identifies unused packages by comparing total installed vs required packages
- Applies whitelist protection using pattern or exact matching
- Removes unused packages with retry logic for dependency conflicts
- Generates detailed reports for each cleanup phase
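The first step, turning hab svc status output into a list of service packages, might look like the sketch below. It assumes the output is a table with one row per service and the package identifier in the first whitespace-separated column; that layout, the sample text, and the method name running_service_idents are assumptions for illustration rather than details confirmed by this README.

```ruby
# Hypothetical parser: keeps the first column of every row that looks like
# a fully qualified ident (origin/name/version/release). The header row and
# any informational lines are dropped because they do not match that shape.
def running_service_idents(status_output)
  status_output.lines
               .map(&:strip)
               .reject(&:empty?)
               .map { |line| line.split(/\s+/).first }
               .select { |ident| ident.count('/') == 3 }
end

# Made-up sample output for illustration only.
sample = <<~STATUS
  package                                       type        desired  state  elapsed (s)  pid   group
  chef/authn-service/1.0.0/20230101000000       standalone  up       up     3600         1201  authn-service.default
  chef/deployment-service/1.0.0/20230101000000  standalone  up       up     3600         1150  deployment-service.default
STATUS

puts running_service_idents(sample)
```

Each identifier returned here would then be passed to hab pkg dependencies --transitive to build the required package list.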
Intelligent Dependency Analysis
The cookbook uses Habitat's built-in dependency analysis:
- Transitive dependencies: Finds all packages required by running services (not just direct dependencies)
- Service-aware: Only considers packages needed by actually running services
- Conflict resolution: Retry logic handles temporary dependency conflicts during removal
- Safety first: Whitelist protection prevents removal of critical system packages
Checking Results
The cookbook generates comprehensive reports for both cleanup phases:
Main Reports:
# Primary cleanup report
cat /var/log/automate-pkg-cleaner/cleanup_report.json

# Consolidated report (if retries occurred)
cat /var/log/automate-pkg-cleaner/cleanup_report_consolidated.json

# Individual retry reports (if applicable)
cat /var/log/automate-pkg-cleaner/cleanup_report_retry_1.json
Report Contents:
Each report includes:
- Phase 1 results: Packages cleaned with habitat_package resource
- Phase 2 analysis: Running services, dependencies, unused packages identified
- Removal results: Successfully removed, failed, not found, and whitelisted packages
- Timing information: Duration of each phase and retry attempt
- Retry details: Which packages required multiple attempts and why
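When several retry reports exist, it can be convenient to list them together. The sketch below loads every cleanup_report*.json file from the report directory and prints each file's top-level keys. It deliberately assumes nothing about the report schema beyond each file being a JSON object; the helper name report_summaries is hypothetical.

```ruby
require 'json'

# Hypothetical helper: maps each report file name to its top-level JSON keys.
# Assumes every report is a JSON object (not an array).
def report_summaries(dir = '/var/log/automate-pkg-cleaner')
  pattern = File.join(dir, 'cleanup_report*.json')
  Dir.glob(pattern).sort.each_with_object({}) do |path, summaries|
    summaries[File.basename(path)] = JSON.parse(File.read(path)).keys
  end
end

# Prints nothing if the directory or the reports do not exist yet.
report_summaries.each do |name, keys|
  puts "#{name}: #{keys.join(', ')}"
end
```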
Sample Log Output:
[INFO] Starting cleanup of 8 specific packages using habitat_package resource
[INFO] Completed specific package cleanup using habitat_package resource
[INFO] Starting identification of unused packages
[INFO] Found 25 running services
[INFO] Found 180 required packages (including dependencies)
[INFO] Found 200 total installed packages
[INFO] Identified 20 unused packages
[INFO] 2 packages were skipped due to whitelist protection
[INFO] Starting removal of 18 packages with retry logic
Understanding the Process
Successful Two-Phase Completion:
Phase 1: Cleaned 8 specific packages with habitat_package resource
Phase 2: Processed 18 unused packages with helper function
=== CLEANUP COMPLETED SUCCESSFULLY ===
All unused packages were successfully removed
Phase 1 Success, Phase 2 Incomplete:
Phase 1: Cleaned 8 specific packages with habitat_package resource
Phase 2: Processed 20 unused packages with helper function
=== CLEANUP INCOMPLETE ===
5 packages could not be removed after 5 retries
RECOMMENDATION: Run this cookbook again later for complete cleanup
When to Re-run:
- If you see "CLEANUP INCOMPLETE" messages for Phase 2
- After other system maintenance that might resolve dependencies
- Wait some time between runs so that Habitat's internal cleanup processes can finish
- Phase 1 is always safe to re-run (uses keep_latest)
The cookbook is designed to be run multiple times safely until all cleanup is complete.
Dependent cookbooks
This cookbook has no specified dependencies.
Contingent cookbooks
There are no cookbooks that are contingent upon this one.
CHANGELOG
[1.0.0] - 2025-08-19
Added
- Initial release of automate_cleanup cookbook
- Support for removing Habitat packages listed in CSV file
- Comprehensive error handling and logging
- Dry-run mode for testing
- Package removal verification
- Backup functionality for current package state
- Detailed cleanup reporting in JSON format
- Support for parallel package removal
- Integration tests with Test Kitchen
- Unit tests with ChefSpec
- Example deployment scripts and configurations
Features
- CSV-based package management: Read package list from CSV file
- Dry-run mode: Test the cleanup process without actually removing packages
- Verification: Verify that packages are actually removed after cleanup
- Backup: Create backup of currently installed packages before cleanup
- Reporting: Generate detailed JSON reports of cleanup operations
- Error handling: Continue processing on errors with detailed logging
- Platform support: Ubuntu, CentOS, RHEL, Amazon Linux
- Parallel processing: Optional parallel package removal for faster execution
Configuration Options
- Configurable timeouts for Habitat commands
- Customizable log levels
- Flexible file paths for CSV, backups, and reports
- Option to continue or halt on package removal errors
Testing
- Comprehensive unit tests with ChefSpec
- Integration tests with Test Kitchen and InSpec
- Multiple test suites including dry-run scenarios
Collaborator Number Metric
0.1.0 failed this metric
Failure: Cookbook has 0 collaborators. A cookbook must have at least 2 collaborators to pass this metric.
Cookstyle Metric
0.1.0 passed this metric
No Binaries Metric
0.1.0 passed this metric
Testing File Metric
0.1.0 failed this metric
Failure: To pass this metric, your cookbook metadata must include a source url, the source url must be in the form of https://github.com/user/repo, and your repo must contain a TESTING.md file
Version Tag Metric
0.1.0 failed this metric
Failure: To pass this metric, your cookbook metadata must include a source url, the source url must be in the form of https://github.com/user/repo, and your repo must include a tag that matches this cookbook version number