
Edwin Genego
Applied AI Tooling

Composable AI Tool Stack

Targeted augmentation. Measurable outcomes.

A pragmatic suite of AI-augmented capabilities embedded across analysis, authoring, quality, performance, and operational pipelines - selectively integrated where leverage is real.

Image Generation, Upscaling & Training Tooling

Internal command-line tools for AI image workflows used in experiments, case studies, and client work.

Related Case Study

These tools were used to build the AI Influencer workflow - generating training datasets, face-swapped images, and multi-angle character views.

View AI Influencer Case Study

View All Tools & Commands

18 commands spanning generation, enhancement, training, face swap, and utilities

  • generate_fusion - Face-swapped image generation with a LoRA model, face-swap models (Easel/CodePlug), and optional restoration
  • generate_story - Generate narrative story sequences (typically 20 scenes) with consistent character appearance using LoRA models
  • cast_story_scenes - Two-stage absurdist pipeline: transforms realistic story scenes into absurdist satirical versions using Nano Banana, adding surrealism, exaggeration, and meta-humor while maintaining narrative structure
  • generate_multiangle - Generate rotated views of subjects using Nano Banana for 360° turnarounds and character sheets
  • generate_lora_images_v2 - Scene-aware LoRA training with structured prompts from a JSON library; supports multi-angle turnarounds (8 angles)
  • generate_model_comparison - Generate side-by-side comparisons of the same prompt across v2, v3, and v4 models; essential for model evaluation and quality tracking
  • generate_v4_images - Showcase generation with the edwin-avatar-v4 model; random sampling across categories for portfolio and demonstration
  • generate_training_images - Legacy training dataset generator; superseded by generate_lora_images_v2
  • ai_editor - Targeted image editing with natural language: make specific changes to images without regenerating entire scenes; LPA-aligned prompts for material upgrades, object additions, and lighting adjustments; interactive + CLI modes
  • refine_training_images - Enhance training images with Nano Banana while preserving facial identity and features
  • upscale_ai_image - Image enhancement with Nano Banana (professional photography quality) plus optional Real-ESRGAN upscaling; defaults to enhance-only for maximum detail preservation
  • enhance_existing_images - Batch-enhance existing AI-generated images while preserving character identity from custom LoRA training, with conservative, balanced, or upscale-only methods
  • faceswap_story_scenes - Apply face swapping to story scene images using Easel Advanced or CodePlug models, with optional GFPGAN face restoration
  • convert_training_images - Convert WebP training images to PNG with quality preservation and transparency handling (RGBA to RGB with a white background)
  • shuffle_training_images - Shuffle and randomly rename training images with alphanumeric names for dataset anonymization and preparation
  • extract_last_frame - Extract the last frame, or the frame at a specific timestamp, from video files using OpenCV for keyframe creation and video workflows
  • generate_cinematic_video - Generate videos with cinematic camera movements (dolly, pan, crane, orbit, zoom) via prompt engineering; no proprietary APIs needed, just smart prompts with your existing video models
  • apply_visual_effects - Apply cinematic visual effects (glow, particles, sparkles, lens flare, film grain, transformations) using img2img prompt engineering; replicates Higgsfield-style effects without third-party APIs

Command Categories

Generation, Enhancement, Training, Face Swap, Utility


Generation

Create new AI images from prompts

Enhancement

Upscale & refine existing images

Training

Dataset prep for LoRA fine-tuning

Face Swap

Identity transfer & face replacement

Utility

Format conversion & preprocessing

Real-World Workflow Examples

End-to-end pipelines with actual costs and processes

View Full Build →

End-to-end pipelines showing actual costs and processes for custom AI character creation

Custom LoRA Training

Training

Train an AI model on a single person (84 photos) to generate consistent character images across any scene or pose

  • Initial LoRA training $2.00
  • Generate 100 base images $0.40
  • Nano Banana scene casting (cast character into 100 new environments) $3.50
  • V3 model fine-tuning $2.00
  • Complete Pipeline $7.90

Result: Production-ready AI character model trained exclusively on one person

AI Influencer Content

Generation

Create brand-consistent content with custom-trained model, face swapping, and professional enhancement

  • 50 images (custom LoRA) $0.50
  • Face swap (Easel; replaces face with real reference photo) $2.50
  • Face restoration (GFPGAN) $0.10
  • Format conversion Free
  • 50 Pro Images $3.10

Result: Portfolio-grade content maintaining perfect identity across scenarios

Narrative Story Sequence

Story

Generate 20-scene visual narrative with consistent character across different settings and angles

  • 20 scene generations $0.20
  • Multi-angle views (8 angles; 360° character turnaround) $0.28
  • 5 key scene refinements $0.18
  • Video extraction utilities Free
  • Complete Story $0.66

Result: Cohesive visual narrative ready for animation or presentation
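Each of the three pipelines above is a straight sum of per-step API costs; a minimal sketch in Python (step costs copied from the tables above, dictionary keys are illustrative names, not real identifiers):

```python
# Per-step costs in USD, copied from the three workflow tables above.
PIPELINES = {
    "custom_lora_training": [2.00, 0.40, 3.50, 2.00],   # training, base images, scene casting, v3 tune
    "ai_influencer_content": [0.50, 2.50, 0.10, 0.00],  # images, face swap, restoration, free conversion
    "narrative_story": [0.20, 0.28, 0.18, 0.00],        # scenes, multi-angle, refinements, free extraction
}

def pipeline_total(steps):
    """Sum a pipeline's per-step costs, rounded to whole cents."""
    return round(sum(steps), 2)

totals = {name: pipeline_total(steps) for name, steps in PIPELINES.items()}
```

The computed totals match the figures quoted above: $7.90, $3.10, and $0.66 respectively.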

Practical Cost Breakdown: From Training to 30s Video

Actual costs from the AI Influencer build

Most people just prompt Sora or Veo directly and get generic results - anyone's face, inconsistent identity, no brand coherence. This breakdown shows what it actually costs to build a custom AI character that looks like you across unlimited scenarios, with character persistence good enough to build a brand around.

The Result

A custom LoRA model that generates your specific face in any scene, plus 50+ portfolio-ready images showing character consistency across professional, lifestyle, travel, and tech scenarios. Then video generation using those trained images as keyframes - not random AI faces.

Phase 1: Foundation

~$7

  • Custom LoRA training (84 images) $3.25
  • Generate 50 production images $0.50
  • Face swap 50 images (CodePlug) $0.19
  • Upscale 50 images to 4K $0.50
  • Scene casting (multi-angle, 20 scenes) $0.70
  • Visual narrative (33-scene story) $0.33

Portfolio of 50+ character-consistent images across scenarios

Phase 2: 30s Video Creation

$0.50–$18

  • ByteDance SeeDance Pro $0.50 (3× 10s clips @ $0.15–0.50 ea • 1080p • best value)
  • Minimax Hailuo 02 $1.50 (3× 10s clips @ $0.50 ea • 1080p • good quality)
  • Google Veo 3 Fast $12 (4× 8s clips @ $3 ea • 1080p • native audio)
  • OpenAI Sora 2 $18 (3× 10s clips @ $6 ea • premium quality • audio sync)

All models support image-to-video from your trained character images

Complete custom AI Influencer build (API)

Custom model + 50 images + 30s video with audio

$7.50–$25

depending on video model

Foundation (one-time): $7.00

  • SeeDance Route: 8.5/10, $7.50 total ($0.05/s) ★ Best Value (17× cheaper, 85% quality)
  • Hailuo Route: 9/10, $8.50 total ($0.05/s) ★ Quality Leader (tops benchmarks)
  • Veo 3 Route: 9.5/10, $19 total ($0.40/s) Premium (native audio)
  • Sora 2 Route: 10/10, $25 total ($0.60/s) Top Tier (best overall)
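Each route total is simply the one-time foundation cost plus that model's 30-second video cost; a sketch with the figures quoted on this page (not an official pricing API):

```python
FOUNDATION_COST = 7.00  # one-time: LoRA training + image pipeline (USD)

# Cost of ~30s of video per model, as quoted in Phase 2 above (USD).
VIDEO_30S = {
    "seedance_pro": 0.50,
    "hailuo_02": 1.50,
    "veo_3_fast": 12.00,
    "sora_2": 18.00,
}

def route_total(model):
    """One-time foundation cost plus 30s of video for the chosen model."""
    return round(FOUNDATION_COST + VIDEO_30S[model], 2)
```

`route_total("seedance_pro")` reproduces the $7.50 best-value figure, and `route_total("sora_2")` the $25.00 top end, matching the $7.50–$25 range quoted above. Only the video term scales with duration; the foundation cost is paid once.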

Quality vs. Cost Efficiency

  • SeeDance: 85% quality ★★★ value
  • Hailuo: 90% quality ★★★ value
  • Veo 3 Fast: 95% quality ★ premium
  • Sora 2: 100% quality ★ premium

Quality scores from benchmark tests & user reviews • Hailuo 02 topped Veo 3 in recent comparisons

Foundation cost is one-time • Video generation costs scale with duration

Alternative: Cloud Subscription Services

For 30s video

If you prefer subscription-based tools with polished UIs over API access, here's what 30s of video costs on popular platforms. Note: These don't include custom LoRA training for your specific face.

  • Higgsfield ($9/mo): 30s with budget models (Hailuo, SeeDance) costs $9.00 (+20%) ✗ no custom face training
  • Runway ($15/mo): 30s with Gen-4 Turbo costs $15.00 (+64%) ✗ generic characters only
  • Kling AI ($6.60/mo): 30s at standard quality costs $7.00 (-7%) ✗ no face persistence
  • Luma Ray2 ($10/mo): 30s with Ray2 Flash costs $10.00 (+36%) ✗ random faces each time

The Real Advantage: Custom Face + API Control

Cloud services offer polished UIs but use generic models - you can't train them on YOUR face for character consistency. The API route includes custom LoRA training (50 images of you), giving you a persistent digital twin + automation capabilities. Costs are similar, but the API gives you ownership and creative control.

What is LoRA Training?

LoRA (Low-Rank Adaptation) fine-tunes an AI model on a specific person using 50-100 photos. The result is a custom trigger word (like "TOK") that generates that exact person in any scenario while maintaining their unique features, expressions, and identity. This is how we create AI influencers that look consistently like one real person across thousands of generated images.

What is Scene Casting?

Scene casting uses Nano Banana to take a generated character and "teleport" them into completely different environments while preserving their exact identity. Instead of just upscaling one image, you create diverse training data by placing the same person in European streets, mountain lakes, urban rooftops, Japanese gardens, etc. This dramatically expands dataset variety from a single generation pass.


Layered Prompt Architecture (LPA)

V1 ACTIVE FRAMEWORK

Universal 7-layer prompting framework achieving 90-100% facial likeness for LoRA-based image generation. Separates identity, positioning, environment, lighting, color, camera, and quality into discrete non-interfering layers.

Current Integration: Automated image generation and style casting with edwin-avatar-v4 LoRA model. Framework powers all generation commands, AI editor, and scene casting workflows.

How It Works

LPA is the prompt construction framework used before generation - not post-processing. Every image, edit, and refinement passes through these 7 layers to build the initial prompt that gets sent to the AI model. The framework prevents common AI artifacts by explicitly defining shadows, depth, materials, and realism at prompt time.

Explore the 7 Layers

L1: Identity Anchor (TOK Purity)

PURE identity trigger. Only TOK + pose + clothing + camera engagement. NEVER add body type, skin texture, or realism modifiers here - they dilute the LoRA signal.

TOK male, full body standing, face clearly visible looking toward camera with thoughtful expression, contemporary casual attire, both hands visible at sides, exactly two hands, anatomically correct

  • Include "looking toward camera"
  • Specify hand count + position
  • NO "athletic build" or "skin texture"
  • NO realism tags in this layer
L2: Spatial Positioning

Subject's scale, frame position, and compositional placement. Critical for architectural photography where the subject is a scale element.

, subject positioned as small scale element (1/12th of frame), standing in lower third of composition, dwarfed by towering architecture

  • Portrait: 60% of vertical frame
  • Architectural: 1/12th scale element
L3: Environment & Background

Complete material specification for every surface - walls, floors, ceilings. Add natural imperfections, weathering, and texture variation by distance. Prevents vague AI backdrops.

, futuristic curved spiral staircase, polished white marble steps, brushed steel railings, floor-to-ceiling glass walls, fully designed architectural background with complete spatial definition, every surface material specified, natural surface imperfections and subtle weathering, foreground surfaces with visible fine texture detail, background textures naturally compressed by atmospheric perspective, NO miniature model texture appearance

  • Specify all surface materials
  • Add weathering & imperfections
  • NO vague "modern interior"
  • NO undefined backgrounds
L4: Lighting & Shadow Logic

CRITICAL: shadows ONLY from identifiable light sources. Prevents phantom shadows from non-existent lights. Add volumetric effects, atmospheric haze, and proper shadow gradients.

, architectural interior lighting, even diffused quality, cool white (5000-6000K), shadows cast exclusively from identifiable light sources in scene, NO random shadows from non-existent lights, volumetric light rays visible through atmospheric haze, natural shadow gradients with proper falloff

  • Define shadow source explicitly
  • Add volumetric light rays
  • NO "dramatic lighting" without a source
  • NO random phantom shadows
L5: Color Grading

Color palette, saturation levels, tonal treatment. Keeps color decisions separate from environmental material specs.

, desaturated architectural environment (very low, almost monochromatic), white and gray primary tones, minimalist editorial color treatment

  • Architectural: desaturated (20-40%)
  • Portrait: natural color palette
L6: Camera Technical

Lens, aperture, depth-of-field gradient, perspective. CRITICAL: specify "natural DoF gradient" and "NO tilt-shift miniature effect" to prevent a model-like appearance.

, ultra-wide angle (14-24mm equivalent), worm's eye view, natural depth of field gradient from foreground to background, subject in primary focus plane with critical sharpness, progressive focus falloff toward distant background, NO tilt-shift miniature effect, correct relative scale between foreground and background elements

  • "Natural DoF gradient" specified
  • Progressive focus falloff
  • NO uniform sharpness everywhere
  • Avoid tilt-shift effects
L7: Quality & Anti-AI Tags

Photography style, anti-AI modifiers, photographic realism standards. Prevents synthetic AI smoothing and a plastic appearance.

, professional architectural photography, high detail, luxury hospitality aesthetic, editorial quality, authentic photographic capture not AI-generated smoothness, natural film grain and sensor noise, no plastic artificial AI smoothing or synthetic polish

  • "Authentic photographic capture"
  • "Natural film grain"
  • "No plastic AI smoothing"
  • "Genuine lens characteristics"
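The layer separation described above maps naturally onto a builder that concatenates seven ordered strings; a minimal sketch (layer fragments abbreviated from the examples above; an illustration of the idea, not the production implementation):

```python
# Seven ordered, non-interfering layers: identity first, quality tags last,
# so realism modifiers never dilute the TOK trigger in L1.
LPA_LAYERS = [
    "TOK male, full body standing, face clearly visible looking toward camera",     # L1 identity anchor
    "subject positioned as small scale element (1/12th of frame)",                  # L2 spatial positioning
    "futuristic curved spiral staircase, every surface material specified",         # L3 environment
    "shadows cast exclusively from identifiable light sources in scene",            # L4 lighting logic
    "desaturated architectural environment, white and gray primary tones",          # L5 color grading
    "ultra-wide angle (14-24mm equivalent), natural depth of field gradient",       # L6 camera technical
    "authentic photographic capture, natural film grain, no plastic AI smoothing",  # L7 quality tags
]

def build_lpa_prompt(layers):
    """Join the seven LPA layers, in order, into a single generation prompt."""
    if len(layers) != 7:
        raise ValueError("LPA expects exactly seven layers")
    return ", ".join(layers)

prompt = build_lpa_prompt(LPA_LAYERS)
```

Because each concern lives in exactly one slot, swapping the L3 environment and L5 color fragments for a portrait treatment leaves the L1 identity anchor untouched.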

Core Principles

Layer Separation

Each layer handles ONE concern. Mixing concerns dilutes effectiveness.

Non-Interference

Realism tags in L7 only, NEVER in L1. Keeps TOK signal pure.

Explicit Logic

Shadows from sources, DoF gradients, surface materials - nothing vague.

LPA in the Generation Pipeline

1. Build Prompt: apply the 7 layers
2. Send to Model: Flux-dev LoRA
3. Result: 90-100% likeness

LPA defines the prompt structure before generation, preventing artifacts like phantom shadows, tilt-shift effects, and AI smoothing at the source

Prevents AI Artifacts

  • Miniature/tilt-shift effects
  • Phantom shadow casting
  • Synthetic AI polish

Applies To

  • Architectural photography
  • Portrait & lifestyle
  • Product photography

Applied During Generation

  • Initial image generation (all commands)
  • AI editor prompt construction
  • Enhancement & refinement passes
View in AI Workflows Documentation: docs/spec/

Add Targeted AI Capabilities

Not blanket automation - precision augmentation mapped to measurable constraints: cycle-time, quality, throughput, incident lag.