Lambda Function: S3 Batch Fingerprint Processor

This AWS Lambda function scans every object in a specified Amazon S3 bucket. For each object, it generates a pre-signed URL and calls the /v1/files/fingerprint API, which returns a SHA-256 fingerprint and stores it, together with the file metadata, on the Algorand blockchain.

Code: s3_batch_fingerprint_processor.py

import json
import boto3
import requests

# Constants
FINGERPRINT_API_URL = "https://sandbox.qudefense.com/v1/files/fingerprint"
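# 24 hours. A URL signed with the Lambda role's temporary credentials
# stops working once those credentials expire, which may be sooner.
EXPIRATION_SECONDS = 86400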
BUCKET_NAME = "soumya-qudefense"
PREFIX = ""  # Optional: set to folder path in bucket if needed

HEADERS = {
    "Content-Type": "application/json",
    "qudefense-client-id": "qudefense-outlook-api",
    "qudefense-secret-access-key": "G4ZsFDoS2Px7Jqy6oApxN1_onjxLPoH2-nqQnYXBeqo"
}

def lambda_handler(event=None, context=None):
    s3 = boto3.client("s3")

    try:
        paginator = s3.get_paginator("list_objects_v2")
        page_iterator = paginator.paginate(Bucket=BUCKET_NAME, Prefix=PREFIX)

        for page in page_iterator:
            for obj in page.get("Contents", []):
                key = obj["Key"]
                try:
                    presigned_url = s3.generate_presigned_url(
                        ClientMethod="get_object",
                        Params={"Bucket": BUCKET_NAME, "Key": key},
                        ExpiresIn=EXPIRATION_SECONDS,
                        HttpMethod="GET",
                    )
                    # Do not log the URL itself: it grants read access until it expires.
                    print(f"Generated presigned URL for {key}")
                except Exception as e:
                    print(f"Error generating presigned URL for {key}: {str(e)}")
                    continue

                payload = {"s3_presigned_url": presigned_url}
                try:
                    # A timeout keeps one stalled request from consuming the whole Lambda run.
                    response = requests.post(FINGERPRINT_API_URL, json=payload, headers=HEADERS, timeout=30)
                    print(f"[{key}] API Response: {response.status_code} - {response.text}")
                except Exception as e:
                    print(f"Error calling fingerprint API for {key}: {str(e)}")

    except Exception as e:
        print(f"Unhandled error in batch Lambda: {str(e)}")

API Invocation
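
For each object, the Lambda posts a JSON body containing the pre-signed URL to the fingerprint endpoint, together with the authentication headers. The following is a minimal sketch of that single call as a standalone helper. The helper name `post_fingerprint`, the injectable `post` parameter, and the 30-second timeout are assumptions made for illustration; in the Lambda itself, `requests.post` would be passed as `post`.

```python
from typing import Any, Callable, Dict

FINGERPRINT_API_URL = "https://sandbox.qudefense.com/v1/files/fingerprint"


def post_fingerprint(
    presigned_url: str,
    headers: Dict[str, str],
    post: Callable[..., Any],
    timeout: float = 30.0,
) -> Dict[str, Any]:
    """Send one pre-signed URL to the fingerprint API and summarize the outcome.

    `post` is any callable with the same keyword interface as `requests.post`
    (a hypothetical injection point so the call can be exercised offline).
    """
    payload = {"s3_presigned_url": presigned_url}  # same body the Lambda sends
    response = post(FINGERPRINT_API_URL, json=payload, headers=headers, timeout=timeout)
    return {
        "status_code": response.status_code,
        "ok": 200 <= response.status_code < 300,  # treat any 2xx as success
        "body": response.text,
    }
```

Returning a small summary dictionary rather than printing lets the caller decide whether to log, retry, or count failures.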

Configuration

The target S3 bucket name must be defined in the Lambda function's environment variables as:
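
The variable name is not given here, and the code listing hardcodes the bucket in `BUCKET_NAME`. As a rough sketch of reading the bucket from the environment instead, the helper below assumes two hypothetical variable names, `S3_BUCKET_NAME` and `S3_PREFIX`; substitute whatever names the deployment actually defines.

```python
import os
from typing import Mapping, Tuple


def load_bucket_config(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (bucket, prefix) read from environment variables.

    S3_BUCKET_NAME and S3_PREFIX are hypothetical names used for illustration.
    """
    bucket = env.get("S3_BUCKET_NAME", "").strip()
    if not bucket:
        # Fail fast rather than scanning an unintended bucket.
        raise RuntimeError("S3_BUCKET_NAME environment variable is not set")
    prefix = env.get("S3_PREFIX", "")  # an empty prefix means the whole bucket
    return bucket, prefix
```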

Authentication

The following headers are used for API authentication and are hardcoded in the Lambda function:

qudefense-client-id
qudefense-secret-access-key
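
Hardcoded credentials are difficult to rotate and easy to leak through source control. One possible alternative, sketched below, builds the same headers from environment variables. The variable names `QUDEFENSE_CLIENT_ID` and `QUDEFENSE_SECRET_ACCESS_KEY` and the helper `build_auth_headers` are hypothetical; only the header names come from the code listing.

```python
import os
from typing import Dict, Mapping


def build_auth_headers(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build the fingerprint API headers from environment variables.

    The environment variable names are assumptions made for this sketch.
    """
    client_id = env.get("QUDEFENSE_CLIENT_ID")
    secret = env.get("QUDEFENSE_SECRET_ACCESS_KEY")
    if not client_id or not secret:
        raise RuntimeError("QuDefense API credentials are not configured")
    return {
        "Content-Type": "application/json",
        "qudefense-client-id": client_id,
        "qudefense-secret-access-key": secret,
    }
```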

Required IAM Policies

Ensure the Lambda function role has the following AWS permissions:
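
The permission list is not shown here. Judging only from the calls in the code listing, listing objects with `list_objects_v2` needs `s3:ListBucket` on the bucket, and the pre-signed `get_object` URLs only work if the signing role has `s3:GetObject` on the objects. The sketch below builds a policy document with just those two statements; it is inferred from the code, not taken from the author's list, and the helper name `build_s3_read_policy` is hypothetical. Writing the `print` output to CloudWatch Logs typically also requires the standard Lambda logging permissions, which are not included.

```python
import json
from typing import Any, Dict


def build_s3_read_policy(bucket: str) -> Dict[str, Any]:
    """Return a minimal IAM policy document for listing and reading one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level permission used by list_objects_v2.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Object-level permission that the pre-signed GET URLs rely on.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_s3_read_policy("soumya-qudefense"), indent=2))
```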

Sample Input

{
  "s3_presigned_url": "<pre-signed-url>"
}

Example API Response

[
  {
    "filename": "user1/invoice.pdf",
    "status": "success",
    "fingerprint": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "metadata_hash": "7c4a8d09ca3762af61e59520943dc26494f8941b",
    "file_metadata": {
      "file_name": "invoice.pdf",
      "file_size": 12345,
      "file_type": "application/pdf",
      "modified_date": "2024-01-01T12:00:00Z"
    },
    "blockchain_app_id": "123456789",
    "blockchain_txid": "ABCDEF123456789",
    "file_id": "550e8400-e29b-41d4-a716-446655440000"
  },
  {
    "filename": "user2/report.docx",
    "status": "success",
    "fingerprint": "2c26b46b68ffc68ff99b453c1d304134134b93b30d38a1c508c6f8fc34c64a3c",
    "metadata_hash": "4c93d709cf3bb2a123ec1f29899309a81fc948cc",
    "file_metadata": {
      "file_name": "report.docx",
      "file_size": 84720,
      "file_type": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
      "modified_date": "2024-02-15T09:30:00Z"
    },
    "blockchain_app_id": "987654321",
    "blockchain_txid": "XYZ123456789DEF",
    "file_id": "1b4e28ba-2fa1-11d2-883f-0016d3cca427"
  },
  {
    "filename": "archive/backups/data.json",
    "status": "success",
    "fingerprint": "c157a79031e1c40f85931829bc5fc552",
    "metadata_hash": "dbfdcaf44d1bd82aaca4e354230135d1",
    "file_metadata": {
      "file_name": "data.json",
      "file_size": 3270,
      "file_type": "application/json",
      "modified_date": "2024-03-21T17:45:00Z"
    },
    "blockchain_app_id": "567890123",
    "blockchain_txid": "LMNOP0987654321",
    "file_id": "7f204f21-8891-4b2d-a346-5f00e8cd6b98"
  }
]
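
Because the API is described as returning SHA-256 fingerprints, a caller may want to sanity-check each record before trusting it. The helper below is a minimal sketch, assuming records shaped like the example above; the function name `invalid_fingerprints` is hypothetical. It reports every record that is not marked as a success or whose fingerprint is not 64 hexadecimal characters.

```python
import re
from typing import Any, Dict, Iterable, List

# A SHA-256 digest rendered as hex is exactly 64 characters long.
_SHA256_HEX = re.compile(r"[0-9a-f]{64}")


def invalid_fingerprints(records: Iterable[Dict[str, Any]]) -> List[str]:
    """Return the filenames of records that fail the status or fingerprint check."""
    bad: List[str] = []
    for record in records:
        fingerprint = str(record.get("fingerprint", "")).lower()
        is_success = record.get("status") == "success"
        if not is_success or _SHA256_HEX.fullmatch(fingerprint) is None:
            bad.append(str(record.get("filename", "<unknown>")))
    return bad
```

Applied to the example response, this check would flag archive/backups/data.json, whose fingerprint is 32 hexadecimal characters long.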