A channel called 10X Income went from zero to 315,000 subscribers with one repeatable strategy: show a face on camera without ever showing your actual face. The videos feature an AI-generated avatar, a ChatGPT script, and b-roll anyone can source in an afternoon. No camera. No studio. No on-screen personality required.
This post walks through every tool and step in the production process so you can replicate it yourself. You need Midjourney (or a Fiverr substitute), ChatGPT, D-ID, CapCut, Google Trends, and a b-roll source. That is the full toolkit. Below is the exact workflow, including the Midjourney prompt Alston uses, the ChatGPT prompt structure, and the editing steps inside CapCut.
What You’ll Walk Out With
- The exact Midjourney prompt that generates an avatar looking directly at the camera
- A step-by-step process for animating that avatar with D-ID’s talking head tool
- A ChatGPT prompt format for writing ready-to-paste YouTube scripts
- The b-roll mistake that makes most faceless channels look identical to each other
- How to assemble the final video in CapCut without paying for any editing software
- A Google Trends research method for finding topics people are actively searching right now
- A Fiverr path for anyone who cannot access Midjourney
- Clarity on which income stream fits your actual skills, via finder.platformproof.com
Why the 10X Income Format Works
The 10X Income YouTube channel reached 315,000 subscribers in a relatively short window. That growth is not random. The format succeeds because it removes every friction point that stops most people from starting a YouTube channel: no camera shyness, no on-screen presence, no lighting setup, and no risk of being recognized on the street.
The format is also built to scale. Once you generate an avatar you like, you reuse the exact same image in every video. Viewers recognize the character. Over time the channel builds brand recognition without the creator ever appearing on screen. That consistency compounds.
High-output channels in the make money online niche tend to grow faster than low-output ones, and faceless production is faster than on-camera work once you learn the workflow. The first couple of videos take several hours. By video five or six you cut that time down significantly because the tools become familiar and the production routine becomes automatic.
Step 1: Create Your AI Avatar with Midjourney
The first thing you need is an AI-generated character to serve as the on-screen face of the channel. Alston uses Midjourney for this step. Go to Midjourney and enter the following prompt:
Prompt: lowly style character looking up towards the camera looking toward and facing the camera in a futuristic room at night lowly style
The key detail is that the character should be looking directly toward and facing the camera. Alston states that instruction twice in the prompt because Midjourney often interprets “looking up” as looking away from the viewer. Doubling up on the camera-facing direction pushes the model to produce the result you actually want.
Plan to run the prompt three or four times before you get an image worth keeping. Each generation takes roughly three to four minutes. When you get one you like, right-click and save it to your desktop. That file is your avatar, and you will reuse it in every future video. You never have to redo this step once you have one good image.
If the Lowly-style character does not match your niche or aesthetic, swap the character description. Alston mentions “office worker” or “office employee” as alternatives that still produce good camera-facing results. You can also include background details in the prompt or leave the background plain and swap it out in CapCut during the editing step.
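As a minimal sketch of the swap described above, the prompt can be treated as a template in which the camera-facing wording stays fixed and only the character description changes. The function name `build_avatar_prompt` and its parameters are hypothetical helpers for keeping prompt variants consistent. Nothing here calls Midjourney; it only assembles the text you would paste in.

```python
def build_avatar_prompt(character: str = "lowly style character",
                        setting: str = "in a futuristic room at night",
                        style_suffix: str = "lowly style") -> str:
    """Assemble an avatar prompt while keeping the camera-facing wording fixed."""
    # This wording is copied from the prompt quoted in Step 1 and is never changed.
    camera_direction = "looking up towards the camera looking toward and facing the camera"
    parts = [character, camera_direction, setting, style_suffix]
    # Drop empty pieces so an omitted setting or suffix leaves no stray spaces.
    return " ".join(part.strip() for part in parts if part.strip())


default_prompt = build_avatar_prompt()
office_prompt = build_avatar_prompt(character="office worker", style_suffix="")

print(default_prompt)
print(office_prompt)
```

Passing an empty `setting` leaves the background out of the prompt, which matches the option of swapping the background later in CapCut.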
The reason for reusing the same avatar across every upload is that 10X Income’s growth is built on visual consistency. Viewers who see the same character in thumbnail after thumbnail start to associate that face with your channel. It functions like a logo for the content.
Step 2: Animate the Avatar with D-ID
A static image is not a video. D-ID at d-id.com is the tool that closes that gap. It takes your avatar image and generates a talking head video where the character’s mouth, eyes, and facial expressions move in sync with audio. The result looks like a real presenter reading a script on camera.
Log into D-ID and click Create Video. Click Add and upload the avatar image from Midjourney. From there you have two options:
Option A: Type in a script. D-ID reads the script using a synthetic voice and syncs the avatar’s face to the generated audio. On the free tier, clips are limited to a few seconds, but the sync quality is accurate. Alston demonstrates this live in the video and the result is clean.
Option B: Upload your own audio. This is what Alston recommends if you want to improve your monetization odds. YouTube’s review process gives more scrutiny to channels that use synthetic voices throughout. Recording your own voice-over and uploading it into D-ID gives you an animated avatar that speaks in your actual voice. According to Alston, that combination looks more authentic and passes monetization review more easily than a fully AI-generated production.
The free tier is enough to test the tool and confirm that your avatar animates correctly. To produce full-length YouTube videos, you will need a paid plan. Alston recommends starting on free, seeing what the tool produces, then upgrading once you know you want to commit to the workflow.
Step 3: Write Your Script with ChatGPT
Whichever D-ID option you choose, you need a script before you can generate anything: Option A reads it aloud for you, and Option B needs something for you to record. ChatGPT handles this part of the workflow. Alston uses a prompt structured like this:
Prompt: Act as an expert in YouTube script writing. We are going to create a script of the five best ways to make money in 2024. Include the script only.
The phrase “include the script only” is in there because ChatGPT tends to add outlines, section labels, bold headers, and transition notes on top of the actual script text. Even with that instruction, the model sometimes returns formatted output anyway. If the first response includes formatting you did not ask for, run the prompt again. The second or third attempt usually produces clean prose.
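If you want a quick way to decide whether a response needs a re-run, a small check like the one below can flag leftover formatting. This is not part of Alston's workflow; it is a hedged sketch, and the function name `has_unwanted_formatting` and the pattern list are assumptions about what typical stray formatting looks like.

```python
import re

# Hypothetical patterns for formatting that tends to sneak into a script.
FORMAT_PATTERNS = [
    re.compile(r"^\s*#{1,6}\s", re.MULTILINE),        # markdown-style headers
    re.compile(r"\*\*[^*]+\*\*"),                      # bold text such as **Host:**
    re.compile(r"^\s*[-*•]\s+", re.MULTILINE),         # bullet points
    re.compile(r"^\s*\[[^\]]+\]\s*$", re.MULTILINE),   # bracketed notes on their own line
]


def has_unwanted_formatting(script: str) -> bool:
    """Return True if the text contains headers, bold labels, bullets, or stage notes."""
    return any(pattern.search(script) for pattern in FORMAT_PATTERNS)


clean = "Welcome back. Today we cover five ways to earn online."
messy = "## Intro\n**Host:** Welcome back.\n[Transition]"

print(has_unwanted_formatting(clean))
print(has_unwanted_formatting(messy))
```

A response that trips the check is a candidate for running the prompt again, as described above.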
Once you have a clean script, copy the section you want to use and paste it into D-ID’s script field. Click Generate Video. Alston walks through this in real time in the video and demonstrates a talking avatar reading the script with accurate lip sync.
The “five best ways to make money with X” format works well for this niche because it is easy to structure, matches the type of searches people run on YouTube, and gives ChatGPT a clear enough brief to produce usable output on the first or second try. You can substitute any sub-topic into the prompt: “five best ways to make money on Etsy,” “five best ways to earn with affiliate marketing,” and so on. The topic comes from Google Trends, which is Step 6 below.
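The substitution described above can be sketched as a small template function. The name `build_script_prompt` and the number-word lookup are hypothetical; the sentence structure is copied from the prompt quoted earlier in this step, and the function only builds the text you would paste into ChatGPT.

```python
def build_script_prompt(topic: str, count: int = 5) -> str:
    """Build the script prompt for a 'best ways to ...' video on the given topic."""
    # Spell out common counts to match the wording of the original prompt.
    number_words = {3: "three", 5: "five", 7: "seven", 10: "ten"}
    count_text = number_words.get(count, str(count))
    return (
        "Act as an expert in YouTube script writing. "
        f"We are going to create a script of the {count_text} best ways to {topic}. "
        "Include the script only."
    )


original = build_script_prompt("make money in 2024")
etsy = build_script_prompt("make money on Etsy")

print(original)
print(etsy)
```

Keeping “Include the script only.” as a fixed final sentence means every topic variant carries the same formatting instruction.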
Step 4: Source B-Roll That Does Not Look Like Everyone Else’s
B-roll is the footage that plays over the top of the talking head. Alston describes it as a way to break up the visual monotony of watching a static avatar read a script. If you look at the 10X Income channel, b-roll appears every few seconds throughout each video.
The mistake that most faceless YouTube channels make is sourcing all of their b-roll from Pexels. Pexels is free and easy to use, which is exactly why every channel ends up with the same clips. Alston points out that a search for “make money” on Pexels returns roughly 1,400 videos, and most creators simply grab the first ones that come up. The result is that the same clips appear across hundreds of different channels. Viewers who have watched more than one faceless YouTube channel in this niche recognize the footage immediately.
Alston’s recommendation is to use a paid subscription like Storyblocks. When you search “make money” on Storyblocks and sort results by Most Recent, you get footage that other creators have not already grabbed and reused dozens of times. Envato is mentioned as another paid b-roll option with a similar benefit.
The practical difference is that paid b-roll makes your channel look fresher than the free alternatives. In a crowded niche where many channels use the same tools and cover the same topics, production differentiation is one of the few things you can control. Fresh b-roll is a small detail that adds up over dozens of videos.
Step 5: Edit Everything Together in CapCut
CapCut is the tool Alston uses to assemble the final video. It is free and has a desktop application, which matters because the desktop version keeps working even when the browser-based version has issues. The desktop app also handles heavier editing tasks without the lag you get in browser tools.
To build the video in CapCut, create a new 16:9 project and drag in your files. The D-ID talking head video goes on layer one. Your b-roll clips go on layer two above it, so they cover the avatar whenever they play. CapCut also includes an Auto Cutout feature that removes the background from your avatar video so the b-roll shows through behind the character. For that effect the layer order flips: the avatar clip has to sit on the track above the b-roll. Alston notes this feature works best with real human footage but produces usable results with animated avatars too, depending on how the Midjourney image rendered the background.
You can also position the avatar in a corner of the frame while b-roll fills the rest of the screen, which is a common layout for finance and money content on YouTube. Adjust sizing and positioning, then export at 1080p when you are satisfied with the result.
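To make the corner layout concrete, here is a rough sketch of the arithmetic for a 1080p export, which is 1920 by 1080 pixels at 16:9. In CapCut you would simply drag and resize the clip; the function `corner_placement`, the 1024 by 1024 avatar size, the 25 percent scale, and the 40 pixel margin are all illustrative assumptions.

```python
def corner_placement(frame_w: int = 1920, frame_h: int = 1080,
                     avatar_w: int = 1024, avatar_h: int = 1024,
                     scale: float = 0.25, margin: int = 40,
                     corner: str = "bottom-right") -> tuple[int, int, int, int]:
    """Return (x, y, width, height) in pixels for an avatar placed in a frame corner."""
    # Size the avatar as a fraction of frame height and keep its aspect ratio.
    target_h = round(frame_h * scale)
    target_w = round(avatar_w * target_h / avatar_h)
    # x and y are the top-left corner of the avatar, measured from the frame's top-left.
    x = margin if "left" in corner else frame_w - target_w - margin
    y = margin if "top" in corner else frame_h - target_h - margin
    return x, y, target_w, target_h


bottom_right = corner_placement()
top_left = corner_placement(corner="top-left")

print(bottom_right)  # (1610, 770, 270, 270)
print(top_left)      # (40, 40, 270, 270)
```

At these settings the avatar covers a small square in the corner and leaves the rest of the frame to the b-roll.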
Alston is straightforward about the time cost on the first video. Learning where each feature lives in CapCut, figuring out layer ordering, and getting the Auto Cutout to behave the way you want all takes longer the first time than you expect. By the second and third video the process moves faster. The goal is to build a production routine that feels automatic by the fifth or sixth video, so that the only decision-making energy goes toward topic selection and scripting, not tool navigation.
Step 6: Find Winning Topics with Google Trends
Knowing how to produce a faceless video is only useful if you know what to make it about. Alston’s research method is Google Trends at trends.google.com. The process takes a few minutes and gives you search data that most creators in the make money online niche skip entirely.
Type a broad keyword into Google Trends and scroll down to the Related Queries section to see what specific questions people are actually searching. Alston demonstrates with “make money online” and surfaces searches like “side hustles from home,” “is Branded Surveys legit,” and “is Inbox Dollars legit.” Each of those is a standalone video topic with real search volume behind it.
He also runs the same search with the keyword “Etsy” and finds “Etsy Christmas items to sell online” as a real trending query. That translates into a clear video title: “The 5 Best Etsy Christmas Items to Sell Online.” The topic is already trending, the search intent is obvious, and the five-item list format matches what ChatGPT handles well in the scripting step.
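The step from trending query to video title can be sketched as a simple text transform. The function `title_from_query` and its short-word list are hypothetical, and the approach only suits queries that are noun phrases like the Etsy example. A question-style query such as “is Inbox Dollars legit” needs a hand-written title instead.

```python
# Short words that stay lowercase in a title unless they come first.
SMALL_WORDS = {"a", "an", "and", "at", "for", "from", "in", "of", "on", "the", "to", "with"}


def title_from_query(query: str, count: int = 5) -> str:
    """Turn a noun-phrase search query into a 'The N Best ...' video title."""
    cased = []
    for index, word in enumerate(query.split()):
        if index != 0 and word.lower() in SMALL_WORDS:
            cased.append(word.lower())
        else:
            # Capitalize only the first letter so names like "Etsy" are left intact.
            cased.append(word[:1].upper() + word[1:])
    return f"The {count} Best " + " ".join(cased)


title = title_from_query("Etsy Christmas items to sell online")
print(title)  # The 5 Best Etsy Christmas Items to Sell Online
```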
This research step should come before every video, not just the first one. Topics that are trending right now get more early views because they match active searches. Topics that are evergreen, like “how to make money with Etsy,” stay relevant over longer periods but often grow more slowly. A channel that mixes trending and evergreen topics tends to perform better than one that focuses on only one type.
Once you have a topic from Google Trends, feed it into the ChatGPT script prompt, generate the script, drop it into D-ID, and the rest of the production chain takes over from there.
The Fiverr Option When You Cannot Access Midjourney
Midjourney requires a paid subscription and access through Discord. If that creates a barrier, Alston points to Fiverr as a workable substitute. Search “animated person” on Fiverr and you will find sellers who build custom AI avatars for clients.
Alston mentions finding a Fiverr listing offering a personalized animated AI avatar video for around $50. His recommendation is to ask the seller for the avatar image only, not a full produced video, since you will animate it yourself through D-ID anyway. Getting just the image asset should cost less than a full package and gives you exactly what you need to plug into the rest of the workflow.
The most important instruction to give the Fiverr seller is that the character must be facing directly into the camera. D-ID’s talking head animation works best when the eyes and mouth are forward-facing. An avatar looking off at an angle produces misaligned results when animated. Be specific in your Fiverr brief: camera-facing, centered, clear face, neutral or minimal background.
Complete Workflow From Start to Finish
Here is the full production process in the order you run it:
- Open Google Trends and search your niche keyword. Find two or three specific questions in the Related Queries section with rising search volume. Choose one as your topic.
- Go to ChatGPT and use the “act as expert YouTube scriptwriter” prompt to generate a clean script for your topic. Re-run the prompt if the first output includes formatting you did not ask for.
- Open Midjourney and run the avatar prompt until you get a character facing the camera directly. Save the image to your desktop. This step only needs to happen once; you reuse the same avatar going forward.
- Log into D-ID, create a new video, upload the avatar image, and paste in your script (or upload your recorded audio). Click Generate Video and download the result.
- Go to Storyblocks and download b-roll clips that match your topic. Sort by Most Recent to avoid footage that other channels have already reused heavily.
- Open the CapCut desktop app. Create a 16:9 project. Place the D-ID talking head video on layer one and your b-roll clips on layer two above it. Use Auto Cutout to remove the avatar background if desired. Adjust sizing and positioning, then export at 1080p.
- Upload to YouTube with a title that matches the specific search query you found in Google Trends. Write the description using the same keywords. Add a custom thumbnail featuring the avatar character.
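The list above can also be written down as a small checklist structure, which makes it explicit that the Midjourney step drops out once you have an avatar. This is a minimal sketch; the `Step` class, the `one_time` flag, and `steps_for_video` are hypothetical names, and the step wording is condensed from the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    tool: str
    action: str
    one_time: bool = False  # True when the step is done once and its result reused


WORKFLOW = [
    Step("Google Trends", "pick a topic from Related Queries"),
    Step("ChatGPT", "generate a clean script"),
    Step("Midjourney", "create the camera-facing avatar", one_time=True),
    Step("D-ID", "animate the avatar with the script or recorded audio"),
    Step("Storyblocks", "download b-roll sorted by Most Recent"),
    Step("CapCut", "assemble the layers and export at 1080p"),
    Step("YouTube", "upload with a title matching the search query"),
]


def steps_for_video(avatar_ready: bool) -> list[Step]:
    """Return the steps to run for the next video, in order."""
    return [step for step in WORKFLOW if not (step.one_time and avatar_ready)]


for number, step in enumerate(steps_for_video(avatar_ready=True), start=1):
    print(f"{number}. {step.tool}: {step.action}")
```

For the first video, call `steps_for_video(avatar_ready=False)` to get all seven steps; every later video runs the remaining six.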
Honest Drawbacks of This Approach
The faceless YouTube model has real advantages but it is not a shortcut to passive income. Here are the genuine limitations you should know before you start:
Monetization takes longer with AI voices. YouTube’s monetization review team looks more carefully at channels using synthetic voices. Alston specifically recommends recording your own audio instead of using D-ID’s default voice. If you use a synthetic voice for every video, expect a longer path to monetization approval than channels using real human narration.
The first few videos take longer than you think. Alston is upfront about this. Learning five different tools simultaneously while also researching topics and writing scripts adds friction to the early production sessions. Build extra time into your first few videos rather than expecting to publish on day one.
Free b-roll produces generic results. Using Pexels is not a disqualifier, but it means your videos visually match hundreds of other faceless channels in the same niche. If you go the free route, make sure your topic selection and scripting quality are strong enough to compensate for production similarity.
Avatar generation requires iteration. You will not get the right Midjourney image on the first attempt. Budget three to five tries, each taking three to four minutes. Knowing this in advance makes the iteration feel like a normal part of the process rather than a sign that something is wrong.
This is a content business, not a set-it-and-forget-it system. Faceless does not mean effortless. You still research topics, write or review scripts, source b-roll, edit video, and upload on a consistent schedule. The advantage is that you can do all of that without a camera or on-screen presence. The work is still real work, and consistency is still required for the channel to grow.
Find Your X
The faceless YouTube format works across many niches, but starting in the wrong one wastes months of effort. Before you invest time in the production workflow, it helps to confirm whether YouTube is actually the right platform for your skills and situation, or whether your energy would produce faster results through affiliate marketing, digital products, or a different channel entirely.
The Finder tool at finder.platformproof.com walks you through a short quiz and tells you which online income method fits your current skills and schedule. It is free and takes under five minutes.
Frequently Asked Questions
Do I need to show my face to start a faceless YouTube channel?
No. The entire purpose of this workflow is that your face never appears on screen. You use a Midjourney avatar as the on-screen character and either a synthetic voice or your own recorded audio to narrate the content. The 10X Income channel built 315,000 subscribers using this method without the creator’s face appearing in any video.
Is D-ID free to use?
D-ID offers a free tier that limits you to short clips of roughly five seconds or less per video. That is enough to test the tool and confirm how your avatar looks when animated. For full-length YouTube videos, you need a paid plan. Alston recommends testing on the free tier first before committing to a subscription to make sure the tool fits your workflow.
Can I get monetized on YouTube with AI-generated content?
Yes, but YouTube’s monetization review applies more scrutiny to channels that use synthetic voices and fully AI-generated scripts. Alston’s specific recommendation is to record your own audio narration rather than relying on D-ID’s default synthetic voice. Channels with real human voice-overs move through monetization review more smoothly than fully automated productions.
Why does Alston recommend Storyblocks over Pexels for b-roll?
Pexels is free, which means the same clips appear across hundreds of faceless YouTube channels. When viewers see footage they have already seen in other videos, it signals that the channel is templated rather than original. Storyblocks costs money but gives access to newer footage that other creators have not already reused. Sorting by Most Recent on Storyblocks surfaces clips that most channels have not grabbed yet, which helps your videos look distinct.
What if my Midjourney avatar does not face the camera?
Re-run the prompt. Alston states the camera-facing direction twice in the prompt specifically because Midjourney often misinterprets directional instructions. If repeated attempts still produce off-angle results, try adjusting the character description from “Lowly style” to “office worker” or “office employee” and see if that improves the camera orientation. Once you get a good result, you save it once and reuse it indefinitely, so the iteration cost is a one-time expense.
What is the best niche for a faceless YouTube channel?
The make money online niche is what 10X Income uses, and it performs well because advertiser demand is high and search volume is large. Other niches that work for faceless content include personal finance, technology overviews, health and wellness information, and productivity. The key factor is that you want a niche where information-delivery content performs well, since a talking avatar reading a script is primarily an information format. Use Google Trends to confirm that your specific topics have real search volume before you commit to a niche.
Do I have to use ChatGPT for scripts, or can I write them myself?
You can write scripts yourself, and that is often the stronger choice. Your own experience and perspective produce more original output than ChatGPT’s default responses, and originality tends to perform better over the long term. ChatGPT is most useful when you need a first draft quickly or when you are not yet confident in your own scripting ability. Use it to produce a starting draft, then edit the output to match the way you actually speak and think so the final result does not sound generic.
How long does it take to produce one video using this workflow?
Alston is direct about this: the first few videos take a significant amount of time while you are learning each tool. Once you are comfortable with the full production chain, a typical video should come together in a few hours from topic selection through CapCut export. The Midjourney avatar step only happens once. After that, your time goes into scripting, D-ID animation, b-roll sourcing, and editing. Consistent production compresses the time per video as the process becomes routine.
Read Next
If you want to see another faceless video format that requires zero paid subscriptions, this post covers how to create Reddit-style TikTok videos without ElevenLabs or any other paid voice tool.
How to Create Reddit TikTok Videos for Free (No 11Labs Required)
Sources
- 10X Income YouTube channel (315,000+ subscribers) – referenced as a real-world example of the faceless format working at scale
- Midjourney (midjourney.com) – AI image generation tool used to create the channel avatar
- D-ID (d-id.com) – talking head video generation tool that animates still images
- ChatGPT (chat.openai.com) – used for YouTube script generation
- Storyblocks (storyblocks.com) – paid b-roll subscription recommended for fresh, less-reused stock footage
- Pexels (pexels.com) – free stock video platform, mentioned as widely overused in the faceless YouTube niche
- Envato – additional paid b-roll option mentioned in the video
- CapCut (capcut.com) – free video editing software with background removal and a desktop application
- Google Trends (trends.google.com) – used for keyword and topic research before each video
- Fiverr (fiverr.com) – marketplace for outsourcing avatar creation when Midjourney is not accessible
Helping 1 million working adults make their first $3,000 online with the skills they already have. Alston Godbolt, Platform Proof.