automate-pkg-cleaner 0.1.0

Removes old Habitat packages after Chef Automate upgrades

Policyfile
cookbook 'automate-pkg-cleaner', '~> 0.1.0', :supermarket

Berkshelf
cookbook 'automate-pkg-cleaner', '~> 0.1.0'

Knife
knife supermarket install automate-pkg-cleaner
knife supermarket download automate-pkg-cleaner
Quality 40%

Automate Cleanup Cookbook

Intelligently removes old and unused Habitat packages from Chef Automate servers using dynamic dependency analysis and a two-phase cleanup approach.

What it does

After upgrading Chef Automate, there are usually leftover Habitat packages that aren't needed anymore. This cookbook automatically identifies and removes them using intelligent dependency analysis and Habitat's native uninstall mechanism.

The cookbook uses a two-phase cleanup approach:

Phase 1: Specific Package Cleanup

  • Targets known packages that accumulate multiple versions over time
  • Uses Chef's habitat_package resource with keep_latest '1' to safely retain the newest version
  • Processes a predefined list of packages (customizable via attributes)

Phase 2: Intelligent Unused Package Detection

  • Analyzes running services to identify what packages are actually in use
  • Uses transitive dependency analysis with hab pkg dependencies --transitive
  • Identifies truly unused packages that no running service depends on
  • Respects whitelist protection for critical packages
  • Implements retry logic for dependency conflicts during removal

Dynamic Package Detection

The cookbook automatically:

  1. Scans running services using hab svc status
  2. Builds dependency tree using Habitat's transitive dependency analysis
  3. Identifies unused packages by comparing installed packages vs required packages
  4. Protects critical packages using configurable whitelist patterns
  5. Safely removes unused packages with intelligent retry logic for dependency conflicts
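Steps 2 to 4 above amount to a set difference between what is installed and what running services need. The following is a minimal sketch in plain Ruby, assuming the running services and their transitive dependencies have already been collected (for example from hab svc status and hab pkg dependencies --transitive). The method name unused_packages and the sample identifiers are hypothetical and are not taken from the cookbook's source.

```ruby
require 'set'

# Hypothetical helper: returns installed packages that no running service needs.
# running_services: fully qualified idents of the running services
# deps_for:         callable returning the transitive dependencies of an ident
# installed:        every installed package ident
def unused_packages(running_services, deps_for, installed)
  required = Set.new
  running_services.each do |svc|
    required << svc                     # the service package itself is required
    required.merge(deps_for.call(svc))  # plus everything it depends on
  end
  installed.reject { |ident| required.include?(ident) }.sort
end

# Illustrative data only, not real Chef Automate package releases.
DEPS = {
  'chef/authn-service/1.0.0/20240101000000' => ['core/glibc/2.35/20230101000000'],
}.freeze

installed = [
  'chef/authn-service/0.9.0/20230101000000',
  'chef/authn-service/1.0.0/20240101000000',
  'core/glibc/2.35/20230101000000',
]

p unused_packages(DEPS.keys, ->(ident) { DEPS.fetch(ident, []) }, installed)
```

Passing the dependency lookup as a callable keeps the comparison testable without a Habitat installation. Whitelist filtering would then be applied to the returned list before anything is removed.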

No manual package lists or CSV files required!

Retry Logic for Dependencies

The cookbook handles dependency conflicts intelligently:

  1. Initial Attempt: Tries to uninstall all target packages
  2. Dependency Detection: Identifies packages that can't be removed due to dependencies
  3. Retry Loop: Waits and retries removal of failed packages (dependencies may resolve as other packages are removed)
  4. Multiple Reports: Generates separate reports for each retry attempt
  5. Graceful Completion: Continues successfully even if some packages remain (provides guidance for re-running)
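The retry flow described above can be sketched as a loop that retries only the packages that failed on the previous pass. This is an illustration under stated assumptions, not the cookbook's implementation: remove_with_retries, the remove callable, and the pause callable are hypothetical names, and the max_retries and delay_seconds arguments stand in for the max_retries and retry_delay_seconds attributes.

```ruby
# Hypothetical retry loop. `remove` returns true when a package was
# uninstalled; `pause` is injected so the sketch can run without sleeping.
def remove_with_retries(packages, max_retries:, delay_seconds:, remove:, pause: ->(s) { sleep(s) })
  removed = []
  pending = packages.dup
  attempt = 0

  loop do
    # Try everything still pending; keep the failures for the next pass.
    succeeded, pending = pending.partition { |pkg| remove.call(pkg) }
    removed.concat(succeeded)
    break if pending.empty? || attempt >= max_retries

    attempt += 1
    pause.call(delay_seconds)
  end

  { removed: removed, remaining: pending, retries: attempt }
end

# Illustrative scenario: core/a cannot be removed until chef/b is gone.
blockers = { 'core/a' => ['chef/b'] }
gone = []
remove = lambda do |pkg|
  next false unless (blockers.fetch(pkg, []) - gone).empty?

  gone << pkg
  true
end

p remove_with_retries(%w[core/a chef/b], max_retries: 5, delay_seconds: 0,
                      remove: remove, pause: ->(_s) {})
```

Returning the remaining packages instead of raising mirrors the graceful completion step: the caller can report what is left and suggest a later run.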

Requirements

  • Chef Infra Client 16.0+
  • Chef Automate Server Build 4.13.290 or higher
  • Habitat installed on the system
  • Root access (removing packages requires root privileges)
  • Works on Ubuntu, CentOS, RHEL, Amazon Linux

Configuration

Key attributes you can customize:

# Core settings
node['automate-pkg-cleaner']['backup_dir'] = '/var/log/automate-pkg-cleaner'            # Backup location  
node['automate-pkg-cleaner']['hab_binary'] = '/usr/bin/hab'                             # Habitat binary path
node['automate-pkg-cleaner']['hab_pkgs_path'] = '/hab/pkgs'                             # Habitat packages directory
node['automate-pkg-cleaner']['report_path'] = '/var/log/automate-pkg-cleaner/cleanup_report.json'  # Report file

# Version and retry settings
node['automate-pkg-cleaner']['required_automate_version'] = '4.13.290'                 # Minimum Chef Automate version
node['automate-pkg-cleaner']['max_retries'] = 5                                        # Maximum retry attempts
node['automate-pkg-cleaner']['retry_delay_seconds'] = 30                               # Delay between retry attempts

# Package whitelist - protects critical packages from removal
node['automate-pkg-cleaner']['package_whitelist'] = [
  'core/hab',           # Protects ALL versions of core/hab
  'core/hab-sup',       # Protects ALL versions of core/hab-sup  
  'core/hab-launcher'   # Protects ALL versions of core/hab-launcher
]

# Phase 1 cleanup targets - packages to clean with keep_latest
node['automate-pkg-cleaner']['packages_to_cleanup'] = [
  'chef/authn-service',
  'chef/automate-cli',
  'chef/automate-cs-bookshelf',
  'chef/automate-cs-nginx',
  'chef/automate-cs-oc-bifrost',
  'chef/automate-cs-oc-erchef',
  'chef/automate-cs-ocid',
  'chef/deployment-service'
]

Package Whitelist Protection

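The whitelist accepts two kinds of entries to protect critical packages: origin/name patterns, which cover every installed version of a package, and fully qualified identifiers, which cover one exact release.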

# Protects ALL versions of specified packages
node['automate-pkg-cleaner']['package_whitelist'] = [
  'core/hab',        # All versions of hab
  'core/hab-sup',    # All versions of hab-sup
  'chef/my-service'  # All versions of my-service
]

Exact Version Protection

# Protects only specific versions
node['automate-pkg-cleaner']['package_whitelist'] = [
  'core/hab/1.6.652/20220211034027',           # Only this exact version
  'chef/special-service/1.0.0/20230101000000'  # Only this exact version
]

Mixed Protection

# Combine pattern and exact matching
node['automate-pkg-cleaner']['package_whitelist'] = [
  'core/hab',                                   # All versions (pattern)
  'core/hab-sup',                               # All versions (pattern)
  'chef/special/1.0.0/20230101000000'          # Specific version (exact)
]
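One way to read the pattern and exact entries above is sketched below in plain Ruby. It assumes that a two-segment entry (origin/name) protects every version and that any longer entry must match the full identifier exactly. The method name whitelisted? is hypothetical, and the cookbook's own matching rules may differ.

```ruby
# Hypothetical matcher for whitelist entries.
def whitelisted?(ident, whitelist)
  whitelist.any? do |entry|
    if entry.count('/') == 1
      # Pattern entry: match the package itself or any of its versions.
      # The trailing slash stops 'core/hab' from matching 'core/hab-sup'.
      ident == entry || ident.start_with?("#{entry}/")
    else
      # Exact entry: only this identifier is protected.
      ident == entry
    end
  end
end

whitelist = ['core/hab', 'chef/special/1.0.0/20230101000000']

p whitelisted?('core/hab/1.6.652/20220211034027', whitelist)
p whitelisted?('core/hab-sup/1.6.652/20220211034027', whitelist)
p whitelisted?('chef/special/1.0.1/20230201000000', whitelist)
```

Under this reading, protecting core/hab-sup requires its own entry, which is consistent with the default whitelist listing core/hab, core/hab-sup, and core/hab-launcher separately.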

Customizing Phase 1 Cleanup

Modify the packages targeted for old version cleanup:

# Add or remove packages as needed
node['automate-pkg-cleaner']['packages_to_cleanup'] = [
  'chef/authn-service',
  'chef/automate-cli',
  'chef/your-custom-package',    # Add your packages
  # Remove any you don't want cleaned
]

How It Works

Phase 1: Targeted Package Cleanup

  1. Processes predefined package list using habitat_package resource
  2. Keeps latest version of each package using keep_latest '1'
  3. Safely removes old versions without breaking dependencies
  4. Logs each package operation for transparency
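A recipe fragment along the following lines would express the Phase 1 steps above. It is a sketch that assumes the packages_to_cleanup attribute shown under Configuration; it is meaningful only inside a Chef Infra Client recipe, and the cookbook's actual recipe may differ.

```ruby
# Sketch only: valid inside a Chef recipe, not as standalone Ruby.
node['automate-pkg-cleaner']['packages_to_cleanup'].each do |pkg_name|
  habitat_package pkg_name do
    keep_latest '1'   # retain the newest installed release, remove older ones
    action :remove
  end
end
```

Because the resource keeps the newest release in place, repeating the run leaves a package with a single installed version untouched.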

Phase 2: Dynamic Unused Package Detection

  1. Analyzes running services using hab svc status
  2. Maps dependencies using hab pkg dependencies --transitive for each running service
  3. Builds required package list from all running services and their dependencies
  4. Identifies unused packages by comparing total installed vs required packages
  5. Applies whitelist protection using pattern or exact matching
  6. Removes unused packages with retry logic for dependency conflicts
  7. Generates detailed reports for each cleanup phase
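The first step of Phase 2 needs a list of running services. The sketch below shows one way to extract it from hab svc status text. It assumes a header row containing a state column and rows whose first column is the package identifier; the sample text is made up for illustration and was not captured from a real system, and running_services is a hypothetical name.

```ruby
# Hypothetical parser for `hab svc status` style output.
def running_services(status_output)
  lines = status_output.lines.map(&:strip).reject(&:empty?)
  return [] if lines.empty?

  # Locate the "state" column from the header instead of hard-coding it.
  state_index = lines.shift.split.index('state')
  return [] if state_index.nil?

  lines.map(&:split)
       .select { |cols| cols[state_index] == 'up' }
       .map(&:first)
end

# Illustrative sample only.
SAMPLE = <<~STATUS
  package                                       type        desired  state  elapsed (s)  pid     group
  chef/authn-service/1.0.0/20240101000000       standalone  up       up     120          1234    authn-service.default
  chef/deployment-service/1.0.0/20240101000000  standalone  up       down   5            <none>  deployment-service.default
STATUS

puts running_services(SAMPLE)
```

Each returned identifier would then be passed to hab pkg dependencies --transitive to build the required package list.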

Intelligent Dependency Analysis

The cookbook uses Habitat's built-in dependency analysis:

  • Transitive dependencies: Finds all packages required by running services (not just direct dependencies)
  • Service-aware: Only considers packages needed by actually running services
  • Conflict resolution: Retry logic handles temporary dependency conflicts during removal
  • Safety first: Whitelist protection prevents removal of critical system packages

Checking Results

The cookbook generates comprehensive reports for both cleanup phases:

Main Reports:

# Primary cleanup report
cat /var/log/automate-pkg-cleaner/cleanup_report.json

# Consolidated report (if retries occurred)
cat /var/log/automate-pkg-cleaner/cleanup_report_consolidated.json

# Individual retry reports (if applicable)
cat /var/log/automate-pkg-cleaner/cleanup_report_retry_1.json

Report Contents:

Each report includes:

  • Phase 1 results: Packages cleaned with habitat_package resource
  • Phase 2 analysis: Running services, dependencies, unused packages identified
  • Removal results: Successfully removed, failed, not found, and whitelisted packages
  • Timing information: Duration of each phase and retry attempt
  • Retry details: Which packages required multiple attempts and why
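For a quick look at what a run produced, a small script can list the report files and their top-level JSON keys. The sketch below makes no assumption about the report schema, which is not documented here; summarize_reports is a hypothetical name, and the directory is the default report location shown under Configuration.

```ruby
require 'json'

# Hypothetical helper: maps each cleanup report file in a directory to its
# top-level JSON keys. Covers the primary, consolidated, and retry reports.
def summarize_reports(dir)
  Dir.glob(File.join(dir, 'cleanup_report*.json')).sort.to_h do |path|
    data = JSON.parse(File.read(path))
    [File.basename(path), data.is_a?(Hash) ? data.keys : []]
  end
end

summarize_reports('/var/log/automate-pkg-cleaner').each do |name, keys|
  puts "#{name}: #{keys.join(', ')}"
end
```

If the directory does not exist or holds no reports, the script prints nothing.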

Sample Log Output:

[INFO] Starting cleanup of 8 specific packages using habitat_package resource
[INFO] Completed specific package cleanup using habitat_package resource
[INFO] Starting identification of unused packages
[INFO] Found 25 running services
[INFO] Found 180 required packages (including dependencies)  
[INFO] Found 200 total installed packages
[INFO] Identified 20 unused packages
[INFO] 2 packages were skipped due to whitelist protection
[INFO] Starting removal of 18 packages with retry logic

Understanding the Process

Successful Two-Phase Completion:

Phase 1: Cleaned 8 specific packages with habitat_package resource
Phase 2: Processed 18 unused packages with helper function
=== CLEANUP COMPLETED SUCCESSFULLY ===
All unused packages were successfully removed

Phase 1 Success, Phase 2 Incomplete:

Phase 1: Cleaned 8 specific packages with habitat_package resource  
Phase 2: Processed 20 unused packages with helper function
=== CLEANUP INCOMPLETE ===
5 packages could not be removed after 5 retries
RECOMMENDATION: Run this cookbook again later for complete cleanup

When to Re-run:

  • If you see "CLEANUP INCOMPLETE" messages for Phase 2
  • After other system maintenance that might resolve dependencies
  • Wait some time between runs so that Habitat's internal cleanup processes can finish
  • Phase 1 is always safe to re-run (uses keep_latest)

The cookbook is designed to be run multiple times safely until all cleanup is complete.
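When runs are scheduled by a script, the completion banners quoted above can drive the decision to run again. The sketch below is a minimal illustration that assumes the banners appear verbatim in the captured Chef client output; cleanup_status is a hypothetical name.

```ruby
# Hypothetical check: classifies a run from the banners in its log output.
def cleanup_status(log_text)
  return :incomplete if log_text.include?('=== CLEANUP INCOMPLETE ===')
  return :complete if log_text.include?('=== CLEANUP COMPLETED SUCCESSFULLY ===')

  :unknown
end

log = <<~LOG
  Phase 2: Processed 20 unused packages with helper function
  === CLEANUP INCOMPLETE ===
  5 packages could not be removed after 5 retries
LOG

puts 'Schedule another run later' if cleanup_status(log) == :incomplete
```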

Dependent cookbooks

This cookbook has no specified dependencies.

Contingent cookbooks

There are no cookbooks that are contingent upon this one.

CHANGELOG

[1.0.0] - 2025-08-19

Added

  • Initial release of automate_cleanup cookbook
  • Support for removing Habitat packages listed in CSV file
  • Comprehensive error handling and logging
  • Dry-run mode for testing
  • Package removal verification
  • Backup functionality for current package state
  • Detailed cleanup reporting in JSON format
  • Support for parallel package removal
  • Integration tests with Test Kitchen
  • Unit tests with ChefSpec
  • Example deployment scripts and configurations

Features

  • CSV-based package management: Read package list from CSV file
  • Dry-run mode: Test the cleanup process without actually removing packages
  • Verification: Verify that packages are actually removed after cleanup
  • Backup: Create backup of currently installed packages before cleanup
  • Reporting: Generate detailed JSON reports of cleanup operations
  • Error handling: Continue processing on errors with detailed logging
  • Platform support: Ubuntu, CentOS, RHEL, Amazon Linux
  • Parallel processing: Optional parallel package removal for faster execution

Configuration Options

  • Configurable timeouts for Habitat commands
  • Customizable log levels
  • Flexible file paths for CSV, backups, and reports
  • Option to continue or halt on package removal errors

Testing

  • Comprehensive unit tests with ChefSpec
  • Integration tests with Test Kitchen and InSpec
  • Multiple test suites including dry-run scenarios

Collaborator Number Metric

0.1.0 failed this metric

Failure: Cookbook has 0 collaborators. A cookbook must have at least 2 collaborators to pass this metric.

Cookstyle Metric

0.1.0 passed this metric

No Binaries Metric

0.1.0 passed this metric

Testing File Metric

0.1.0 failed this metric

Failure: To pass this metric, your cookbook metadata must include a source url, the source url must be in the form of https://github.com/user/repo, and your repo must contain a TESTING.md file

Version Tag Metric

0.1.0 failed this metric

Failure: To pass this metric, your cookbook metadata must include a source url, the source url must be in the form of https://github.com/user/repo, and your repo must include a tag that matches this cookbook version number