image cleaner

2026-02-04 07:25:30 +00:00 · 2025-09-15 18:32:13 +02:00
parent bcf11e4e11
commit 9960dc5e38
11 changed files with 1247 additions and 0 deletions
--- a/scripts/ex/.gitignore
+++ b/scripts/ex/.gitignore
@@ -0,0 +1,56 @@
+# Python Virtual Environment
+venv/
+env/
+.env
+
+# Python cache files
+__pycache__/
+*.pyc
+*.pyo
+*.pyd
+.Python
+
+# Generated/processed images
+demo_*
+cleaned_*
+comparison_*
+*_cleaned_*
+*_comparison_*
+
+# Processing outputs
+cleaned/
+output/
+results/
+
+# Configuration files (may contain sensitive settings)
+config.json
+*.config.json
+custom_*.json
+
+# Temporary files
+*.tmp
+*.temp
+.DS_Store
+Thumbs.db
+
+# IDE files
+.vscode/
+.idea/
+*.swp
+*.swo
+*~
+
+# Logs
+*.log
+logs/
+
+# Test outputs
+test_*
+sample_output/
+
+# Large source images (uncomment if you don't want to track originals)
+# *.jpg
+# *.jpeg
+# *.png
+# *.tif
+# *.tiff
--- a/scripts/ex/1771-09b-02.jpg
+++ b/scripts/ex/1771-09b-02.jpg
--- a/scripts/ex/1772-07b-02.jpg
+++ b/scripts/ex/1772-07b-02.jpg
--- a/scripts/ex/1772-34-136.jpg
+++ b/scripts/ex/1772-34-136.jpg
--- a/scripts/ex/README.md
+++ b/scripts/ex/README.md
@@ -0,0 +1,211 @@
+# Historical Newspaper Image Cleaning Pipeline
+
+This pipeline automatically cleans and enhances scanned historical newspaper images by reducing noise, improving contrast, and sharpening text for better readability.
+
+## Features
+
+- **Noise Reduction**: Bilateral filtering and non-local means denoising
+- **Contrast Enhancement**: CLAHE and gamma correction
+- **Background Cleaning**: Morphological operations to remove artifacts
+- **Text Sharpening**: Unsharp masking for improved readability
+- **Batch Processing**: Process entire directories efficiently
+- **Interactive Tuning**: Find optimal parameters for your specific images
+- **Before/After Comparisons**: Visual validation of improvements
+
+## Quick Start
+
+### 1. Install Dependencies
+
+```bash
+pip install -r requirements.txt
+```
+
+### 2. Process Single Image
+
+```bash
+python image_cleaner.py input_image.jpg -o cleaned_image.jpg --comparison
+```
+
+### 3. Batch Process Directory
+
+```bash
+python batch_process.py -i newspaper_scans -o cleaned_images
+```
+
+### 4. Interactive Parameter Tuning
+
+```bash
+python config_tuner.py sample_image.jpg
+```
+
+## Usage Examples
+
+### Basic Image Cleaning
+```bash
+# Clean single image with default settings
+python image_cleaner.py 1771-09b-02.jpg
+
+# Clean with specific processing steps
+python image_cleaner.py 1771-09b-02.jpg --steps denoise contrast sharpen
+
+# Create before/after comparison
+python image_cleaner.py 1771-09b-02.jpg -c
+```
+
+### Batch Processing
+```bash
+# Process all JPG files in current directory
+python batch_process.py
+
+# Process specific directory with custom output
+python batch_process.py -i scans/ -o cleaned/
+
+# Use custom configuration
+python batch_process.py --config custom_config.json
+
+# Skip comparison images for faster processing
+python batch_process.py --no-comparisons
+```
+
+### Parameter Tuning
+```bash
+# Start interactive tuning session
+python config_tuner.py sample_image.jpg
+
+# Load existing config for fine-tuning
+python config_tuner.py sample_image.jpg -c existing_config.json
+```
+
+## Configuration
+
+### Default Parameters
+
+The pipeline uses these default parameters optimized for newspaper scans:
+
+```json
+{
+    "bilateral_d": 9,
+    "bilateral_sigma_color": 75,
+    "bilateral_sigma_space": 75,
+    "clahe_clip_limit": 2.0,
+    "clahe_grid_size": [8, 8],
+    "gamma": 1.2,
+    "denoise_h": 10,
+    "morph_kernel_size": 2,
+    "unsharp_amount": 1.5,
+    "unsharp_radius": 1.0,
+    "unsharp_threshold": 0
+}
+```
+
+### Parameter Descriptions
+
+- **bilateral_d**: Neighborhood diameter for bilateral filtering (5-15)
+- **bilateral_sigma_color**: Color space filter strength (50-150)
+- **bilateral_sigma_space**: Coordinate space filter strength (50-150)
+- **clahe_clip_limit**: Contrast limiting for CLAHE (1.0-4.0)
+- **clahe_grid_size**: CLAHE tile grid size [width, height] (4-16)
+- **gamma**: Gamma correction value (0.8-2.0)
+- **denoise_h**: Denoising filter strength (5-20)
+- **morph_kernel_size**: Morphological operation kernel size (1-5)
+- **unsharp_amount**: Unsharp masking strength (0.5-3.0)
+- **unsharp_radius**: Unsharp masking radius (0.5-2.0)
+- **unsharp_threshold**: Unsharp masking threshold (0-10)
+
+### Creating Custom Configurations
+
+1. Generate default config template:
+```bash
+python batch_process.py --create-config
+```
+
+2. Edit `config.json` with your preferred values
+
+3. Use custom config:
+```bash
+python batch_process.py --config config.json
+```
+
+## Processing Pipeline
+
+The image cleaning pipeline applies these steps in sequence:
+
+1. **Noise Reduction**
+   - Bilateral filtering preserves edges while reducing noise
+   - Non-local means denoising removes repetitive patterns
+
+2. **Contrast Enhancement**
+   - CLAHE improves local contrast adaptively
+   - Gamma correction adjusts overall brightness
+
+3. **Background Cleaning**
+   - Morphological operations remove small artifacts
+   - Background normalization reduces paper texture
+
+4. **Sharpening**
+   - Unsharp masking enhances text edges
+   - Preserves fine details while reducing blur
+
+## Interactive Tuning Commands
+
+When using `config_tuner.py`, these commands are available:
+
+- `set <param> <value>` - Adjust parameter value
+- `show` - Display current parameters
+- `test [steps]` - Process with current settings
+- `compare [filename]` - Save before/after comparison
+- `save <filename>` - Save configuration to file
+- `load <filename>` - Load configuration from file
+- `presets` - Show preset configurations
+- `help` - Show detailed help
+- `quit` - Exit tuning session
+
+## Tips for Best Results
+
+### For Light Damage/Noise:
+- Reduce `bilateral_d` to 5-7
+- Lower `denoise_h` to 5-8
+- Use `clahe_clip_limit` around 1.5
+
+### For Heavy Damage/Artifacts:
+- Increase `bilateral_d` to 12-15
+- Raise `denoise_h` to 15-20
+- Use higher `clahe_clip_limit` (3.0-4.0)
+
+### For Faded/Low Contrast Images:
+- Increase `gamma` to 1.3-1.5
+- Raise `clahe_clip_limit` to 3.0+
+- Boost `unsharp_amount` to 2.0+
+
+### For Sharp/High Quality Scans:
+- Focus mainly on `denoise` and `sharpen` steps
+- Skip `background` cleaning if unnecessary
+- Use lighter settings to preserve quality
+
+## File Structure
+
+```
+newspaper_image_cleaner/
+├── image_cleaner.py      # Core processing module
+├── batch_process.py      # Batch processing script
+├── config_tuner.py       # Interactive parameter tuning
+├── requirements.txt      # Python dependencies
+└── README.md            # This documentation
+```
+
+## Troubleshooting
+
+### ImportError: No module named 'cv2'
+Install OpenCV: `pip install opencv-python`
+
+### Memory Issues with Large Images
+The tuner automatically resizes large images. For batch processing of very large images, consider resizing first.
+
+### Poor Results
+Use the interactive tuner to find optimal parameters for your specific image characteristics.
+
+## Performance
+
+- Single 3000x2000 image: ~3-5 seconds
+- Batch processing depends on image size and quantity
+- Interactive tuning uses smaller images for faster feedback
--- a/scripts/ex/batch_process.py
+++ b/scripts/ex/batch_process.py
@@ -0,0 +1,162 @@
+#!/usr/bin/env python3
+"""
+Batch Processing Script for Historical Newspaper Images
+
+Simple script to process multiple images with the newspaper cleaning pipeline.
+Includes progress tracking and error handling.
+"""
+
+import os
+import sys
+import time
+import json
+from pathlib import Path
+from image_cleaner import NewspaperImageCleaner, create_comparison_image
+
+
+def process_batch(input_dir=".", output_dir="cleaned", config_file=None,
+                 create_comparisons=True, file_pattern="*.jpg"):
+    """
+    Process all newspaper images in a directory.
+
+    Args:
+        input_dir: Directory containing input images
+        output_dir: Directory for cleaned images
+        config_file: JSON file with custom parameters
+        create_comparisons: Whether to create before/after comparisons
+        file_pattern: Glob pattern for files to process
+    """
+
+    # Load custom config if provided
+    config = None
+    if config_file and os.path.exists(config_file):
+        with open(config_file, 'r') as f:
+            config = json.load(f)
+        print(f"Loaded custom config from {config_file}")
+
+    # Initialize cleaner
+    cleaner = NewspaperImageCleaner(config)
+
+    # Setup paths
+    input_path = Path(input_dir)
+    output_path = Path(output_dir)
+    output_path.mkdir(exist_ok=True)
+
+    if create_comparisons:
+        comparison_path = output_path / "comparisons"
+        comparison_path.mkdir(exist_ok=True)
+
+    # Find all image files
+    image_files = list(input_path.glob(file_pattern))
+    image_files.extend(input_path.glob("*.jpeg"))
+    image_files.extend(input_path.glob("*.JPG"))
+    image_files.extend(input_path.glob("*.JPEG"))
+
+    if not image_files:
+        print(f"No image files found in {input_dir}")
+        return
+
+    print(f"Found {len(image_files)} images to process")
+    print(f"Output directory: {output_path.absolute()}")
+
+    # Process each image
+    success_count = 0
+    error_count = 0
+    start_time = time.time()
+
+    for i, img_file in enumerate(image_files, 1):
+        print(f"\n[{i}/{len(image_files)}] Processing: {img_file.name}")
+
+        try:
+            # Process image
+            output_file = output_path / f"cleaned_{img_file.name}"
+            processed, original = cleaner.process_image(img_file, output_file)
+
+            # Create comparison if requested
+            if create_comparisons:
+                comp_file = comparison_path / f"comparison_{img_file.name}"
+                create_comparison_image(original, processed, comp_file)
+
+            success_count += 1
+            print(f"✓ Completed: {img_file.name}")
+
+        except Exception as e:
+            error_count += 1
+            print(f"✗ Error processing {img_file.name}: {str(e)}")
+
+    # Summary
+    elapsed_time = time.time() - start_time
+    print(f"\n" + "="*50)
+    print(f"Batch Processing Complete")
+    print(f"{"="*50}")
+    print(f"Successfully processed: {success_count}")
+    print(f"Errors: {error_count}")
+    print(f"Total time: {elapsed_time:.1f} seconds")
+    print(f"Average time per image: {elapsed_time/len(image_files):.1f} seconds")
+    print(f"Output directory: {output_path.absolute()}")
+
+
+def create_sample_config():
+    """Create a sample configuration file for customization."""
+    config = {
+        "bilateral_d": 9,
+        "bilateral_sigma_color": 75,
+        "bilateral_sigma_space": 75,
+        "clahe_clip_limit": 2.0,
+        "clahe_grid_size": [8, 8],
+        "gamma": 1.2,
+        "denoise_h": 10,
+        "morph_kernel_size": 2,
+        "unsharp_amount": 1.5,
+        "unsharp_radius": 1.0,
+        "unsharp_threshold": 0
+    }
+
+    with open("config.json", "w") as f:
+        json.dump(config, f, indent=4)
+
+    print("Created config.json with default parameters.")
+    print("Edit this file to customize processing settings.")
+
+
+if __name__ == "__main__":
+    import argparse
+
+    parser = argparse.ArgumentParser(
+        description="Batch process historical newspaper images",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+  python batch_process.py                    # Process current directory
+  python batch_process.py -i scans -o clean # Process 'scans' folder
+  python batch_process.py --no-comparisons  # Skip comparison images
+  python batch_process.py --config custom.json  # Use custom settings
+        """
+    )
+
+    parser.add_argument("-i", "--input", default=".",
+                       help="Input directory (default: current directory)")
+    parser.add_argument("-o", "--output", default="cleaned",
+                       help="Output directory (default: cleaned)")
+    parser.add_argument("-c", "--config",
+                       help="JSON config file with custom parameters")
+    parser.add_argument("--no-comparisons", action="store_true",
+                       help="Skip creating before/after comparison images")
+    parser.add_argument("--pattern", default="*.jpg",
+                       help="File pattern to match (default: *.jpg)")
+    parser.add_argument("--create-config", action="store_true",
+                       help="Create sample config file and exit")
+
+    args = parser.parse_args()
+
+    if args.create_config:
+        create_sample_config()
+        sys.exit(0)
+
+    process_batch(
+        input_dir=args.input,
+        output_dir=args.output,
+        config_file=args.config,
+        create_comparisons=not args.no_comparisons,
+        file_pattern=args.pattern
+    )
--- a/scripts/ex/config_tuner.py
+++ b/scripts/ex/config_tuner.py
@@ -0,0 +1,291 @@
+#!/usr/bin/env python3
+"""
+Interactive Parameter Tuning Tool for Newspaper Image Cleaning
+
+This tool helps you find optimal parameters for your specific images
+by providing an interactive tuning interface.
+"""
+
+import cv2
+import json
+import numpy as np
+from pathlib import Path
+from image_cleaner import NewspaperImageCleaner
+
+
+class ParameterTuner:
+    """Interactive parameter tuning for image cleaning pipeline."""
+
+    def __init__(self, sample_image_path):
+        """Initialize with a sample image for tuning."""
+        self.original = cv2.imread(str(sample_image_path))
+        if self.original is None:
+            raise ValueError(f"Could not load image: {sample_image_path}")
+
+        # Resize large images for faster processing during tuning
+        height, width = self.original.shape[:2]
+        if height > 1500 or width > 1500:
+            scale = min(1500/height, 1500/width)
+            new_width = int(width * scale)
+            new_height = int(height * scale)
+            self.original = cv2.resize(self.original, (new_width, new_height))
+            print(f"Resized image to {new_width}x{new_height} for faster tuning")
+
+        self.current_params = self._get_default_params()
+        self.cleaner = NewspaperImageCleaner(self.current_params)
+
+    def _get_default_params(self):
+        """Get default parameters as starting point."""
+        return {
+            'bilateral_d': 9,
+            'bilateral_sigma_color': 75,
+            'bilateral_sigma_space': 75,
+            'clahe_clip_limit': 2.0,
+            'clahe_grid_size': (8, 8),
+            'gamma': 1.2,
+            'denoise_h': 10,
+            'morph_kernel_size': 2,
+            'unsharp_amount': 1.5,
+            'unsharp_radius': 1.0,
+            'unsharp_threshold': 0,
+        }
+
+    def update_parameter(self, param_name, value):
+        """Update a single parameter and refresh the cleaner."""
+        if param_name in self.current_params:
+            # Handle special cases
+            if param_name == 'clahe_grid_size':
+                self.current_params[param_name] = (int(value), int(value))
+            else:
+                self.current_params[param_name] = value
+
+            # Update cleaner with new parameters
+            self.cleaner = NewspaperImageCleaner(self.current_params)
+            print(f"Updated {param_name} = {value}")
+
+    def process_with_current_params(self, steps=None):
+        """Process the sample image with current parameters."""
+        if steps is None:
+            steps = ['denoise', 'contrast', 'background', 'sharpen']
+
+        image = self.original.copy()
+
+        # Apply processing steps
+        if 'denoise' in steps:
+            image = self.cleaner.reduce_noise(image)
+
+        if 'contrast' in steps:
+            image = self.cleaner.enhance_contrast(image)
+
+        if 'background' in steps:
+            image = self.cleaner.clean_background(image)
+
+        if 'sharpen' in steps:
+            image = self.cleaner.sharpen_image(image)
+
+        return image
+
+    def create_comparison(self, steps=None):
+        """Create side-by-side comparison with current parameters."""
+        processed = self.process_with_current_params(steps)
+
+        # Create side-by-side comparison
+        height = max(self.original.shape[0], processed.shape[0])
+        comparison = np.hstack([
+            cv2.resize(self.original, (self.original.shape[1], height)),
+            cv2.resize(processed, (processed.shape[1], height))
+        ])
+
+        return comparison
+
+    def save_comparison(self, output_path, steps=None):
+        """Save comparison image to file."""
+        comparison = self.create_comparison(steps)
+        cv2.imwrite(str(output_path), comparison)
+        print(f"Comparison saved to: {output_path}")
+
+    def save_config(self, config_path):
+        """Save current parameters to JSON config file."""
+        # Convert tuple to list for JSON serialization
+        config_to_save = self.current_params.copy()
+        if 'clahe_grid_size' in config_to_save:
+            config_to_save['clahe_grid_size'] = list(config_to_save['clahe_grid_size'])
+
+        with open(config_path, 'w') as f:
+            json.dump(config_to_save, f, indent=4)
+        print(f"Configuration saved to: {config_path}")
+
+    def load_config(self, config_path):
+        """Load parameters from JSON config file."""
+        with open(config_path, 'r') as f:
+            loaded_params = json.load(f)
+
+        # Convert list back to tuple if needed
+        if 'clahe_grid_size' in loaded_params:
+            loaded_params['clahe_grid_size'] = tuple(loaded_params['clahe_grid_size'])
+
+        self.current_params.update(loaded_params)
+        self.cleaner = NewspaperImageCleaner(self.current_params)
+        print(f"Configuration loaded from: {config_path}")
+
+    def interactive_tune(self):
+        """Start interactive tuning session."""
+        print("\n" + "="*60)
+        print("INTERACTIVE PARAMETER TUNING")
+        print("="*60)
+        print("Commands:")
+        print("  set <param> <value>  - Set parameter value")
+        print("  show                 - Show current parameters")
+        print("  test [steps]         - Test current parameters")
+        print("  save <file>          - Save configuration to file")
+        print("  load <file>          - Load configuration from file")
+        print("  compare [file]       - Save comparison image")
+        print("  presets              - Show parameter presets")
+        print("  help                 - Show this help")
+        print("  quit                 - Exit tuning")
+        print("\nParameters you can adjust:")
+        for param in self.current_params:
+            print(f"  {param}")
+
+        while True:
+            try:
+                command = input("\ntuner> ").strip().split()
+                if not command:
+                    continue
+
+                cmd = command[0].lower()
+
+                if cmd == 'quit' or cmd == 'exit':
+                    break
+
+                elif cmd == 'show':
+                    self._show_parameters()
+
+                elif cmd == 'set' and len(command) >= 3:
+                    param = command[1]
+                    try:
+                        value = float(command[2]) if '.' in command[2] else int(command[2])
+                    except ValueError:
+                        value = command[2]
+                    self.update_parameter(param, value)
+
+                elif cmd == 'test':
+                    steps = command[1:] if len(command) > 1 else None
+                    print("Processing with current parameters...")
+                    processed = self.process_with_current_params(steps)
+                    print(f"Processed image shape: {processed.shape}")
+
+                elif cmd == 'save' and len(command) > 1:
+                    self.save_config(command[1])
+
+                elif cmd == 'load' and len(command) > 1:
+                    self.load_config(command[1])
+
+                elif cmd == 'compare':
+                    output = command[1] if len(command) > 1 else "tuning_comparison.jpg"
+                    self.save_comparison(output)
+
+                elif cmd == 'presets':
+                    self._show_presets()
+
+                elif cmd == 'help':
+                    self._show_help()
+
+                else:
+                    print("Unknown command. Type 'help' for available commands.")
+
+            except KeyboardInterrupt:
+                print("\nExiting tuner...")
+                break
+            except Exception as e:
+                print(f"Error: {str(e)}")
+
+    def _show_parameters(self):
+        """Display current parameter values."""
+        print("\nCurrent Parameters:")
+        print("-" * 30)
+        for param, value in self.current_params.items():
+            print(f"  {param:<20} = {value}")
+
+    def _show_presets(self):
+        """Show preset configurations for different image types."""
+        presets = {
+            "light_cleaning": {
+                "bilateral_d": 5,
+                "denoise_h": 5,
+                "clahe_clip_limit": 1.5,
+                "gamma": 1.1,
+                "unsharp_amount": 1.2
+            },
+            "heavy_cleaning": {
+                "bilateral_d": 15,
+                "denoise_h": 15,
+                "clahe_clip_limit": 3.0,
+                "gamma": 1.3,
+                "unsharp_amount": 2.0
+            },
+            "high_contrast": {
+                "clahe_clip_limit": 4.0,
+                "gamma": 1.4,
+                "unsharp_amount": 2.5
+            }
+        }
+
+        print("\nAvailable Presets:")
+        print("-" * 30)
+        for name, params in presets.items():
+            print(f"{name}:")
+            for param, value in params.items():
+                print(f"  {param} = {value}")
+            print()
+
+    def _show_help(self):
+        """Show detailed help information."""
+        help_text = """
+Parameter Descriptions:
+-----------------------
+bilateral_d          : Neighborhood diameter for bilateral filtering (5-15)
+bilateral_sigma_color: Filter sigma in color space (50-150)
+bilateral_sigma_space: Filter sigma in coordinate space (50-150)
+clahe_clip_limit     : Contrast limit for CLAHE (1.0-4.0)
+clahe_grid_size      : CLAHE tile grid size (4-16)
+gamma                : Gamma correction value (0.8-2.0)
+denoise_h            : Denoising filter strength (5-20)
+morph_kernel_size    : Morphological operation kernel size (1-5)
+unsharp_amount       : Unsharp masking amount (0.5-3.0)
+unsharp_radius       : Unsharp masking radius (0.5-2.0)
+unsharp_threshold    : Unsharp masking threshold (0-10)
+
+Tips:
+- Start with small adjustments (±20% of current value)
+- Test frequently with 'compare' command
+- Save working configurations before major changes
+- Use 'test denoise' to test individual steps
+        """
+        print(help_text)
+
+
+def main():
+    """Main function for command line usage."""
+    import argparse
+
+    parser = argparse.ArgumentParser(description="Interactive parameter tuning for newspaper image cleaning")
+    parser.add_argument("image", help="Sample image path for tuning")
+    parser.add_argument("-c", "--config", help="Load initial config from file")
+
+    args = parser.parse_args()
+
+    try:
+        tuner = ParameterTuner(args.image)
+
+        if args.config:
+            tuner.load_config(args.config)
+
+        tuner.interactive_tune()
+
+    except Exception as e:
+        print(f"Error: {str(e)}")
+
+
+if __name__ == "__main__":
+    main()
--- a/scripts/ex/demo.py
+++ b/scripts/ex/demo.py
@@ -0,0 +1,170 @@
+#!/usr/bin/env python3
+"""
+Demo Script for Newspaper Image Cleaning Pipeline
+
+This script demonstrates the cleaning pipeline on the sample images
+and shows the available functionality.
+"""
+
+import sys
+import os
+from pathlib import Path
+
+# Add current directory to Python path
+sys.path.append(str(Path(__file__).parent))
+
+try:
+    from image_cleaner import NewspaperImageCleaner, create_comparison_image
+    import cv2
+    import numpy as np
+    print("✓ All required libraries imported successfully")
+except ImportError as e:
+    print(f"✗ Import error: {e}")
+    print("Please install required packages: pip install -r requirements.txt")
+    sys.exit(1)
+
+
+def demo_single_image(image_path):
+    """Demonstrate processing a single image."""
+    print(f"\n=== Processing Single Image: {image_path} ===")
+
+    if not os.path.exists(image_path):
+        print(f"Image not found: {image_path}")
+        return False
+
+    try:
+        # Initialize cleaner
+        cleaner = NewspaperImageCleaner()
+
+        # Process image
+        output_path = f"demo_cleaned_{Path(image_path).name}"
+        processed, original = cleaner.process_image(image_path, output_path)
+
+        # Create comparison
+        comparison_path = f"demo_comparison_{Path(image_path).name}"
+        create_comparison_image(original, processed, comparison_path)
+
+        print(f"✓ Processed image saved: {output_path}")
+        print(f"✓ Comparison saved: {comparison_path}")
+        return True
+
+    except Exception as e:
+        print(f"✗ Error processing {image_path}: {str(e)}")
+        return False
+
+
+def demo_step_by_step(image_path):
+    """Demonstrate individual processing steps."""
+    print(f"\n=== Step-by-Step Processing: {image_path} ===")
+
+    if not os.path.exists(image_path):
+        print(f"Image not found: {image_path}")
+        return
+
+    try:
+        # Load image
+        original = cv2.imread(image_path)
+        if original is None:
+            print(f"Could not load image: {image_path}")
+            return
+
+        # Resize if too large for demo
+        height, width = original.shape[:2]
+        if height > 1000 or width > 1000:
+            scale = min(1000/height, 1000/width)
+            new_width = int(width * scale)
+            new_height = int(height * scale)
+            original = cv2.resize(original, (new_width, new_height))
+            print(f"Resized to {new_width}x{new_height} for demo")
+
+        cleaner = NewspaperImageCleaner()
+
+        # Process step by step
+        steps = [
+            ('original', original),
+            ('denoised', cleaner.reduce_noise(original.copy())),
+            ('contrast_enhanced', cleaner.enhance_contrast(original.copy())),
+            ('background_cleaned', cleaner.clean_background(original.copy())),
+            ('sharpened', cleaner.sharpen_image(original.copy()))
+        ]
+
+        # Save each step
+        for step_name, image in steps:
+            output_path = f"demo_step_{step_name}_{Path(image_path).name}"
+            cv2.imwrite(output_path, image)
+            print(f"✓ Saved {step_name}: {output_path}")
+
+        print("✓ Individual processing steps completed")
+
+    except Exception as e:
+        print(f"✗ Error in step-by-step processing: {str(e)}")
+
+
+def show_image_info():
+    """Show information about available images."""
+    print("\n=== Available Sample Images ===")
+
+    image_files = []
+    for ext in ['*.jpg', '*.jpeg', '*.JPG', '*.JPEG']:
+        image_files.extend(Path('.').glob(ext))
+
+    if not image_files:
+        print("No image files found in current directory")
+        return []
+
+    for img_file in image_files:
+        try:
+            # Load image to get dimensions
+            img = cv2.imread(str(img_file))
+            if img is not None:
+                height, width = img.shape[:2]
+                file_size = img_file.stat().st_size / (1024*1024)  # MB
+                print(f"  {img_file.name}: {width}x{height} pixels, {file_size:.1f}MB")
+            else:
+                print(f"  {img_file.name}: Could not load")
+        except Exception as e:
+            print(f"  {img_file.name}: Error - {str(e)}")
+
+    return image_files
+
+
+def main():
+    """Main demo function."""
+    print("Historical Newspaper Image Cleaning Pipeline - Demo")
+    print("=" * 55)
+
+    # Show available images
+    image_files = show_image_info()
+
+    if not image_files:
+        print("\nNo images found. Please add some image files to test.")
+        return
+
+    # Select first image for demo
+    sample_image = image_files[0]
+    print(f"\nUsing sample image: {sample_image.name}")
+
+    # Demo single image processing
+    success = demo_single_image(str(sample_image))
+
+    if success:
+        # Demo step-by-step processing
+        demo_step_by_step(str(sample_image))
+
+        print(f"\n=== Demo Complete ===")
+        print("Generated files:")
+        print("  - demo_cleaned_*.jpg (cleaned image)")
+        print("  - demo_comparison_*.jpg (before/after comparison)")
+        print("  - demo_step_*.jpg (individual processing steps)")
+
+        print(f"\nNext steps:")
+        print(f"  - Try: python config_tuner.py {sample_image.name}")
+        print(f"  - Try: python batch_process.py")
+        print(f"  - Adjust parameters in config.json for better results")
+
+    else:
+        print("\nDemo failed. Please check your Python environment and dependencies.")
+
+
+if __name__ == "__main__":
+    main()
--- a/scripts/ex/image_cleaner.py
+++ b/scripts/ex/image_cleaner.py
@@ -0,0 +1,310 @@
+"""
+Historical Newspaper Image Cleaning Pipeline
+
+This module provides functions to clean and enhance scanned historical newspaper images
+by reducing noise, improving contrast, and sharpening text for better readability.
+"""
+
+import cv2
+import numpy as np
+from PIL import Image, ImageEnhance
+import os
+import argparse
+from pathlib import Path
+
+
+class NewspaperImageCleaner:
+    """
+    Image processing pipeline specifically designed for historical newspaper scans.
+    """
+
+    def __init__(self, config=None):
+        """Initialize with default or custom configuration."""
+        self.config = config or self._default_config()
+
+    def _default_config(self):
+        """Default processing parameters optimized for newspaper scans."""
+        return {
+            'bilateral_d': 9,           # Neighborhood diameter for bilateral filter
+            'bilateral_sigma_color': 75,  # Filter sigma in color space
+            'bilateral_sigma_space': 75,  # Filter sigma in coordinate space
+            'clahe_clip_limit': 2.0,    # Contrast limiting for CLAHE
+            'clahe_grid_size': (8, 8),  # CLAHE grid size
+            'gamma': 1.2,               # Gamma correction value
+            'denoise_h': 10,            # Denoising filter strength
+            'morph_kernel_size': 2,     # Morphological operation kernel size
+            'unsharp_amount': 1.5,      # Unsharp masking amount
+            'unsharp_radius': 1.0,      # Unsharp masking radius
+            'unsharp_threshold': 0,     # Unsharp masking threshold
+        }
+
+    def reduce_noise(self, image):
+        """
+        Apply noise reduction techniques to remove speckles and JPEG artifacts.
+
+        Args:
+            image: Input BGR image
+
+        Returns:
+            Denoised image
+        """
+        # Bilateral filter - preserves edges while reducing noise
+        bilateral = cv2.bilateralFilter(
+            image,
+            self.config['bilateral_d'],
+            self.config['bilateral_sigma_color'],
+            self.config['bilateral_sigma_space']
+        )
+
+        # Non-local means denoising for better noise reduction
+        if len(image.shape) == 3:
+            # Color image
+            denoised = cv2.fastNlMeansDenoisingColored(
+                bilateral, None,
+                self.config['denoise_h'],
+                self.config['denoise_h'],
+                7, 21
+            )
+        else:
+            # Grayscale image
+            denoised = cv2.fastNlMeansDenoising(
+                bilateral, None,
+                self.config['denoise_h'],
+                7, 21
+            )
+
+        return denoised
+
+    def enhance_contrast(self, image):
+        """
+        Improve image contrast using CLAHE and gamma correction.
+
+        Args:
+            image: Input BGR image
+
+        Returns:
+            Contrast-enhanced image
+        """
+        # Convert to LAB color space for better contrast processing
+        if len(image.shape) == 3:
+            lab = cv2.cvtColor(image, cv2.COLOR_BGR2LAB)
+            l_channel, a_channel, b_channel = cv2.split(lab)
+        else:
+            l_channel = image
+
+        # Apply CLAHE (Contrast Limited Adaptive Histogram Equalization)
+        clahe = cv2.createCLAHE(
+            clipLimit=self.config['clahe_clip_limit'],
+            tileGridSize=self.config['clahe_grid_size']
+        )
+        l_channel = clahe.apply(l_channel)
+
+        # Reconstruct image
+        if len(image.shape) == 3:
+            enhanced = cv2.merge([l_channel, a_channel, b_channel])
+            enhanced = cv2.cvtColor(enhanced, cv2.COLOR_LAB2BGR)
+        else:
+            enhanced = l_channel
+
+        # Apply gamma correction
+        gamma = self.config['gamma']
+        inv_gamma = 1.0 / gamma
+        table = np.array([((i / 255.0) ** inv_gamma) * 255
+                         for i in np.arange(0, 256)]).astype("uint8")
+        enhanced = cv2.LUT(enhanced, table)
+
+        return enhanced
+
+    def clean_background(self, image):
+        """
+        Remove small artifacts and clean background noise.
+
+        Args:
+            image: Input image
+
+        Returns:
+            Background-cleaned image
+        """
+        # Convert to grayscale for morphological operations
+        if len(image.shape) == 3:
+            gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
+        else:
+            gray = image
+
+        # Morphological opening to remove small noise
+        kernel = np.ones((self.config['morph_kernel_size'],
+                         self.config['morph_kernel_size']), np.uint8)
+
+        # Opening (erosion followed by dilation)
+        opened = cv2.morphologyEx(gray, cv2.MORPH_OPEN, kernel)
+
+        # If original was color, apply the mask
+        if len(image.shape) == 3:
+            # Create a mask and apply it to the original color image
+            mask = opened > 0
+            result = image.copy()
+            result[~mask] = [255, 255, 255]  # Set background to white
+            return result
+        else:
+            return opened
+
+    def sharpen_image(self, image):
+        """
+        Apply unsharp masking to enhance text clarity.
+
+        Args:
+            image: Input image
+
+        Returns:
+            Sharpened image
+        """
+        # Convert to float for processing
+        float_img = image.astype(np.float32) / 255.0
+
+        # Create Gaussian blur
+        radius = self.config['unsharp_radius']
+        sigma = radius / 3.0
+        blurred = cv2.GaussianBlur(float_img, (0, 0), sigma)
+
+        # Unsharp masking
+        amount = self.config['unsharp_amount']
+        sharpened = float_img + amount * (float_img - blurred)
+
+        # Threshold and clamp
+        threshold = self.config['unsharp_threshold'] / 255.0
+        sharpened = np.where(np.abs(float_img - blurred) < threshold,
+                           float_img, sharpened)
+        sharpened = np.clip(sharpened, 0.0, 1.0)
+
+        return (sharpened * 255).astype(np.uint8)
+
+    def process_image(self, image_path, output_path=None, steps=None):
+        """
+        Process a single image through the complete pipeline.
+
+        Args:
+            image_path: Path to input image
+            output_path: Path for output image (optional)
+            steps: List of processing steps to apply (optional)
+
+        Returns:
+            Processed image array
+        """
+        if steps is None:
+            steps = ['denoise', 'contrast', 'background', 'sharpen']
+
+        # Load image
+        image = cv2.imread(str(image_path))
+        if image is None:
+            raise ValueError(f"Could not load image: {image_path}")
+
+        original = image.copy()
+
+        # Apply processing steps
+        if 'denoise' in steps:
+            print(f"Applying noise reduction...")
+            image = self.reduce_noise(image)
+
+        if 'contrast' in steps:
+            print(f"Enhancing contrast...")
+            image = self.enhance_contrast(image)
+
+        if 'background' in steps:
+            print(f"Cleaning background...")
+            image = self.clean_background(image)
+
+        if 'sharpen' in steps:
+            print(f"Sharpening image...")
+            image = self.sharpen_image(image)
+
+        # Save output if path provided
+        if output_path:
+            cv2.imwrite(str(output_path), image)
+            print(f"Processed image saved to: {output_path}")
+
+        return image, original
+
+    def process_directory(self, input_dir, output_dir, extensions=None):
+        """
+        Process all images in a directory.
+
+        Args:
+            input_dir: Input directory path
+            output_dir: Output directory path
+            extensions: List of file extensions to process
+        """
+        if extensions is None:
+            extensions = ['.jpg', '.jpeg', '.png', '.tif', '.tiff']
+
+        input_path = Path(input_dir)
+        output_path = Path(output_dir)
+        output_path.mkdir(parents=True, exist_ok=True)
+
+        for file_path in input_path.iterdir():
+            if file_path.suffix.lower() in extensions:
+                print(f"\nProcessing: {file_path.name}")
+                output_file = output_path / f"cleaned_{file_path.name}"
+
+                try:
+                    self.process_image(file_path, output_file)
+                except Exception as e:
+                    print(f"Error processing {file_path.name}: {str(e)}")
+
+        print(f"\nBatch processing completed. Results in: {output_dir}")
+
+
+def create_comparison_image(original, processed, output_path):
+    """
+    Create a side-by-side comparison image.
+
+    Args:
+        original: Original image array
+        processed: Processed image array
+        output_path: Path to save comparison
+    """
+    # Resize images to same height if needed
+    h1, w1 = original.shape[:2]
+    h2, w2 = processed.shape[:2]
+
+    if h1 != h2:
+        height = min(h1, h2)
+        original = cv2.resize(original, (int(w1 * height / h1), height))
+        processed = cv2.resize(processed, (int(w2 * height / h2), height))
+
+    # Create side-by-side comparison
+    comparison = np.hstack([original, processed])
+    cv2.imwrite(str(output_path), comparison)
+    print(f"Comparison saved to: {output_path}")
+
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser(description="Clean historical newspaper images")
+    parser.add_argument("input", help="Input image or directory path")
+    parser.add_argument("-o", "--output", help="Output path")
+    parser.add_argument("-d", "--directory", action="store_true",
+                       help="Process entire directory")
+    parser.add_argument("-c", "--comparison", action="store_true",
+                       help="Create before/after comparison")
+    parser.add_argument("--steps", nargs="+",
+                       choices=['denoise', 'contrast', 'background', 'sharpen'],
+                       default=['denoise', 'contrast', 'background', 'sharpen'],
+                       help="Processing steps to apply")
+
+    args = parser.parse_args()
+
+    cleaner = NewspaperImageCleaner()
+
+    if args.directory:
+        output_dir = args.output or "cleaned_images"
+        cleaner.process_directory(args.input, output_dir)
+    else:
+        output_path = args.output
+        if not output_path:
+            input_path = Path(args.input)
+            output_path = input_path.parent / f"cleaned_{input_path.name}"
+
+        processed, original = cleaner.process_image(args.input, output_path, args.steps)
+
+        if args.comparison:
+            comparison_path = Path(output_path).parent / f"comparison_{Path(args.input).name}"
+            create_comparison_image(original, processed, comparison_path)
--- a/scripts/ex/requirements.txt
+++ b/scripts/ex/requirements.txt
@@ -0,0 +1,5 @@
+opencv-python==4.10.0.84
+scikit-image==0.24.0
+Pillow==10.4.0
+numpy==2.1.1
+matplotlib==3.9.2
--- a/scripts/ex/run.sh
+++ b/scripts/ex/run.sh
@@ -0,0 +1,42 @@
+#!/bin/bash
+# Convenience script to run the image cleaning pipeline with virtual environment
+
+# Activate virtual environment
+source venv/bin/activate
+
+# Check if any arguments provided
+if [ $# -eq 0 ]; then
+    echo "Historical Newspaper Image Cleaning Pipeline"
+    echo "Usage examples:"
+    echo "  $0 demo                              # Run demo"
+    echo "  $0 clean image.jpg                   # Clean single image"
+    echo "  $0 batch                             # Process all images in directory"
+    echo "  $0 tune image.jpg                    # Interactive parameter tuning"
+    echo "  $0 python script.py [args]          # Run custom Python script"
+    exit 1
+fi
+
+case "$1" in
+    "demo")
+        python demo.py
+        ;;
+    "clean")
+        shift
+        python image_cleaner.py "$@"
+        ;;
+    "batch")
+        shift
+        python batch_process.py "$@"
+        ;;
+    "tune")
+        shift
+        python config_tuner.py "$@"
+        ;;
+    "python")
+        shift
+        python "$@"
+        ;;
+    *)
+        python "$@"
+        ;;
+esac