Duration: 3 lectures
Trainer: Giorgi Lomidze, 10+ years of experience in tech, including educational content in web development and artificial intelligence
Instructor on global platforms: Udemy, Skillshare, Packt Publishing
Format: Remote
Schedule: March 10, 12, 17, 20:00-22:00
Lecture 1: Photo/Image Generation with Artificial Intelligence
Part 1: Theory and Fundamentals
Theoretical foundations, how to choose the right aspect ratio and resolution
Aspect ratios by content type:
1:1 - Instagram post, profile photo
16:9 - YouTube thumbnail, banner, desktop wallpaper, presentation
9:16 - Instagram/TikTok Stories, Reels, Shorts
4:3 - Classic format, blog post illustration
21:9 - ultrawide/cinematic format
Resolution levels - SD, HD, 2K, 4K, and when to use each
Upscaling - what is it and how to increase quality after generation
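The relationship between aspect ratio, resolution, and upscaling above can be illustrated with a little arithmetic. The following is a minimal Python sketch, not part of the course material: the function names `dimensions` and `upscale` are hypothetical, and the example assumes the resolution is specified by the length of the longer side in pixels.

```python
from fractions import Fraction


def dimensions(aspect: str, long_side: int) -> tuple[int, int]:
    """Return (width, height) in pixels for an aspect ratio such as '16:9'.

    `long_side` is the pixel length of the longer edge (an assumption of
    this sketch; tools may ask for width, height, or a preset instead).
    """
    w, h = (int(part) for part in aspect.split(":"))
    ratio = Fraction(w, h)  # width divided by height
    if ratio >= 1:
        # Landscape or square: the width is the long side.
        return long_side, round(long_side / ratio)
    # Portrait: the height is the long side.
    return round(long_side * ratio), long_side


def upscale(size: tuple[int, int], factor: int) -> tuple[int, int]:
    """Return the pixel size after upscaling by an integer factor."""
    width, height = size
    return width * factor, height * factor


hd_landscape = dimensions("16:9", 1920)  # YouTube thumbnail, banner
hd_portrait = dimensions("9:16", 1920)   # Stories, Reels, Shorts
print(hd_landscape, hd_portrait, upscale(hd_landscape, 2))
```

For example, a 16:9 image with a 1920-pixel long side is 1920x1080 (HD), and a 2x upscale of it is 3840x2160 (4K).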
Working modes
Text-to-Image - Generate an image from text
Image-to-Image - Transform/edit an existing image
Copyright and Ethics
Commercial use
Training Data and Ethical Issues
Part 2: Overview of tools
Tool description and main functions
GPT Image 1.5: Chat interface in ChatGPT. Precise editing, text rendering
Nano Banana Pro: Google's flagship generator. Excellent text rendering, multilingual text support, SynthID watermark
Midjourney: Artistic quality leader. Character references, personalization, style control, animation
Discussion of each tool
GPT Image 1.5: UI Demo, Editing, Text Rendering, Style Filters
Nano Banana Pro: Use in Gemini / Flow / Google AI Studio, text rendering in images, multilingual text, photorealism
Midjourney: Web interface, options (--ar, --style, --v), character reference, customization
Part 3: Practical Session (30 min)
Prompt Engineering Basics
Prompt structure: subject, style, lighting, composition, negative prompt
Running the same prompt in all three tools and comparing the results
Participants' task: Create a YouTube thumbnail or social media post using any tool
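The prompt structure listed above (subject, style, lighting, composition, negative prompt) can be sketched as a small reusable template. This is an illustrative Python sketch only: the class `ImagePrompt` and its methods are hypothetical, the "Avoid:" wording for the negative prompt is an assumption (each tool handles negative prompts differently), and the only tool-specific flag used is Midjourney's `--ar`, which the syllabus itself mentions.

```python
from dataclasses import dataclass


@dataclass
class ImagePrompt:
    """Hypothetical container for the five prompt components."""
    subject: str
    style: str = ""
    lighting: str = ""
    composition: str = ""
    negative: str = ""

    def to_text(self) -> str:
        """Join the non-empty components into one plain-text prompt."""
        parts = [self.subject, self.style, self.lighting, self.composition]
        text = ", ".join(p.strip() for p in parts if p.strip())
        if self.negative.strip():
            # Assumption: the negative prompt is phrased as plain text.
            text += f". Avoid: {self.negative.strip()}"
        return text

    def to_midjourney(self, aspect: str = "16:9") -> str:
        """Append the --ar (aspect ratio) option named in the syllabus."""
        return f"{self.to_text()} --ar {aspect}"


prompt = ImagePrompt(
    subject="a red fox in snow",
    style="watercolor",
    lighting="soft morning light",
    composition="close-up",
    negative="text, watermark",
)
print(prompt.to_text())
print(prompt.to_midjourney("16:9"))
```

Keeping the components separate makes it easy to send the same prompt to several tools and change one element at a time when comparing results.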
AI Content Generation - Syllabus
Lecture 2: Video Generation with Artificial Intelligence
Part 1: Theory and Fundamentals
Video AI operating modes
Text-to-Video - Create a video from a text description
Image-to-Video - Animate a static image
Video-to-Video - Style transformation of an existing video
Trends for 2026
Native 4K output, 60 FPS
15-20+ second clips
Audio-visual synchronization - lip-sync, sound effects, music
Multi-shot storytelling - several shots in one generation
Physics simulation - gravity, water, fabric
Limitations: complex motion, hands, duration limits
Copyright and Ethics
Seedance 2.0 and Hollywood's reaction - copyright issues
The risks of deepfakes
Part 2: Overview of tools
Tool description and main functions
Veo 3.1: Google's flagship. The best all-rounder. Audio and video together, character consistency, Google Flow integration
Kling 3.0: Kuaishou's model (Feb 2026). 15 sec video, Omni model - character consistency, multilingual lip-sync
Seedance 2.0*: ByteDance model (Feb 2026). Audio and video simultaneously. Multimodal reference (image+video+audio), lip-sync
Hailuo: Budget alternative for fast results. Free tier available
*Seedance 2.0 is not widely available at this time; it will most likely be officially launched before the sessions begin
Review of each tool
Veo 3.1: Google Flow interface, text-to-video and image-to-video workflow, audio generation
Kling 3.0: 15 sec video, Omni model, @reference system, Motion Control, multilingual lip-sync
Seedance 2.0: @Reference system, multimodal input, multi-shot narratives, lens switch
Hailuo: interface, fast generation, price comparison with other tools
Part 3: Practical Session, Live Demo, and Assignment
Image-to-Video workflow: Creating a video from an image
Creating a multi-shot scenario (Kling)
Participants' task: Create a 5-15 second video clip for social media.
Lecture 3: Audio Generation - Music, Voice and Sound Effects
Part 1: Theory and Fundamentals
Audio AI Categories
Music Generation (Text-to-Music) - Full songs with vocals
Text-to-Speech (TTS) - Voice Generation and Voice Cloning
Sound Effects - Generate sound effects
Copyright and Licensing
Copyright-safe platforms: which ones are safe for commercial use
Voice Cloning Ethics: Consent, Verification, Deepfake Risks
Use Cases
Background music for YouTube videos
Podcast intro/outro
Advertising voiceovers
Part 2: Overview of tools
Tool description and main functions
ElevenLabs: The industry leader in TTS. Eleven v3 - Emotional TTS, Voice Cloning, Text-to-Dialogue, 70+ languages, Eleven Music (copyright-cleared), Sound Effects
Google AI Studio / Gemini + Lyria 3: Google's audio ecosystem. Lyria 3 - music generation with text/image, 30-second tracks with lyrics and cover art. Lyria RealTime - realtime music generation. SynthID watermark
Suno AI: Music generation leader. Suno v4 - full songs with vocals, genre control, long compositions
Discussion of each tool
ElevenLabs: TTS Demo (v3 model), Voice Cloning workflow, Text-to-Dialogue, Eleven Music, Sound Effects generation
Google Gemini + Lyria 3: Music generation with text and photos
Suno AI: song generation from prompt, lyric writing, style and genre control
Part 3: Practical Session (30 min)
Live demo and assignment
Song generation in Suno (prompt → full track)
Music generation in Gemini with Lyria 3 (text and photo prompt)
Voice-over creation in ElevenLabs (model selection, settings, audio tags)
Participants' task: Create a background track + voiceover for your content.
An electronic certificate will be issued at the end of the course.


