As with so many other things in my life, this started with me asking myself a question: "How would I do that?" The quest began after I watched videos where ChatGPT was used to automate content creation. However, the existing methods involved manual intervention: posing questions, copying responses, and pasting them into another online service to generate videos. That wasn't my vision; I wanted a "generate content" button that required minimal input from me beyond initiating the process.
I was really interested in figuring out how to solve it, and decided to get started. The first thing to figure out was what kind of content I wanted to create. I settled on daily dad jokes - why? Well, that remains a mystery.
For those eager to jump straight to the results, I've compiled a playlist on my YouTube channel featuring all the jokes (about 180) I generated. Additionally, I've shared a video describing the process on my channel.
In order to automate my dad-jokes the way I wanted to, I figured that I needed a few things:
- A service to generate the jokes
- Local storage for the jokes
- A service to generate voicelines
- A video background generator
- Automatic upload to YouTube
- Custom software to seamlessly integrate everything
A quick search led me to various services and potential solutions for each requirement.
Crafting a Plan
The first challenge was generating the dad jokes. Through a series of test queries to ChatGPT, I found a format that allowed for easy parsing of responses - an approach that proved effective, most of the time.
You:
Give me 2 jokes (dad-jokes if possible) with "tech" as theme. Format the result as json using this template: { "jokes": [{theme:"", "part1": "","part2": "","part3": "" }] }
ChatGPT:
Certainly! Here are 2 jokes with a tech theme in the requested JSON format:
{
"jokes": [
{
"theme": "tech",
"part1": "Why don't programmers like nature?",
"part2": "It has too many bugs!",
"part3": ""
},
{
"theme": "tech",
"part1": "Why did the computer go to the doctor?",
"part2": "Because it had a virus!",
"part3": ""
}
]
}
The formatting aspect proved to be crucial, serving as a foundation for later code parsing of the jokes. I also needed API access to ChatGPT, and a brief search assured me that this wouldn't pose a challenge.
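To illustrate why a fixed template helps, here is a minimal sketch of how a response in that shape could be mapped onto C# types. The `Joke`, `JokeList` and `JokeParser` names are assumptions made up for this example, not the types from the actual project, and `System.Text.Json` is used only because it ships with .NET.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical types mirroring the JSON template from the prompt.
public sealed class Joke
{
    [JsonPropertyName("theme")] public string Theme { get; set; } = "";
    [JsonPropertyName("part1")] public string Part1 { get; set; } = "";
    [JsonPropertyName("part2")] public string Part2 { get; set; } = "";
    [JsonPropertyName("part3")] public string Part3 { get; set; } = "";
}

public sealed class JokeList
{
    [JsonPropertyName("jokes")] public List<Joke> Jokes { get; set; } = new();
}

public static class JokeParser
{
    // Returns the parsed jokes, or an empty list when the text is not valid JSON.
    public static List<Joke> Parse(string json)
    {
        try
        {
            JokeList? parsed = JsonSerializer.Deserialize<JokeList>(json);
            return parsed?.Jokes ?? new List<Joke>();
        }
        catch (JsonException)
        {
            // The model answered with something other than the requested JSON.
            return new List<Joke>();
        }
    }
}
```

Because every joke arrives with the same property names, the calling code never has to guess where the setup ends and the punchline begins.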
Moving on, the next requirement was generating voicelines. After exploring various AI voice generator services, I settled on play.ht. What caught my attention was its free tier access to an API, and after conducting successful tests in a preliminary project, I found it to be efficient, eliminating the need for further exploration.
As for the remaining components - video background generation, YouTube uploading, and the custom software - I assumed these tasks would be straightforward and didn't delve deeper into them.
The plan
With these considerations in mind, the plan took shape:
- Develop a small C# project (given my preference for C# and the availability of example code for API access in C#).
- Use ChatGPT to supply me with dad-jokes formatted for easy parsing.
- Save the jokes in a local SQLite database, implementing checks for duplicates and other considerations.
- Utilize the Play.ht API to generate voicelines and download corresponding files.
- Create a random custom background and text with FFMpegCore
- Merge the custom background and text with the voicelines.
- Upload to YouTube
Implementing
The journey commenced with the creation of a fundamental console application in Visual Studio.
namespace ai_shorts
{
    class Program
    {
        static async Task Main(string[] args)
        {
            System.Diagnostics.Debug.WriteLine($"Main(args: {String.Join(",", args)})");
            VideoGenerator theThing = new VideoGenerator();

            System.Diagnostics.Debug.WriteLine($"Init all the things");
            if (!await theThing.Init(args))
            {
                Console.WriteLine("Init failed :(");
                return;
            }

            System.Diagnostics.Debug.WriteLine($"Generate video");
            if (!theThing.GenerateVideo())
            {
                Console.WriteLine("GenerateVideo failed :(");
                return;
            }

            System.Diagnostics.Debug.WriteLine($"Publish video");
            if (!await theThing.PublishVideo())
            {
                Console.WriteLine("PublishVideo failed :(");
                return;
            }

            Console.WriteLine("Done!");
        }
    }
}
I like to keep things in the older style and also add a bunch of debug info everywhere. Anyway - now for some more complicated things.
ChatGPT and dad-jokes
There were a couple of different NuGet packages for accessing OpenAI/ChatGPT, but I ended up using Betalgo.OpenAI; you can read more about it at https://github.com/betalgo/openai. I basically used their example code and modified it to suit my needs. Here is the part of the code that actually asked ChatGPT for jokes. There is a lot more code surrounding this, but I'm not going to give away the full code of this project.
var chatCompletionResult = await openAiService.ChatCompletion.CreateCompletion(new ChatCompletionCreateRequest
{
    Messages = new List<ChatMessage> {
        ChatMessage.FromUser("Hi ChatGPT! " +
            "Could you please give me 2 jokes (dad-jokes if possible) of type \"DadJoke\", \"SoftwareDevelopment\", \"Tech\", or \"Gaming\"? Try to use one of the four types, but if you cannot figure out the type, use \"DadJoke\" as default. " +
            "It would be a big help if you format your response as json using this template: { \"jokes\": [{type:\"\", \"part1\": \"\",\"part2\": \"\",\"part3\": \"\" }] } and respond with the json only. " +
            "And don't tell me that you cannot respond in JSON format - I know you can and I believe in you!")
    },
    Model = Models.ChatGpt3_5Turbo,
    // MaxTokens = 50 //optional
});
if (chatCompletionResult.Successful)
{
    System.Diagnostics.Debug.WriteLine("chatResult: " + chatCompletionResult.Choices.First().Message.Content);
    string chatGPTResponse = chatCompletionResult.Choices.First().Message.Content;

    // Grab everything from the first '{' to the last '}' so that any
    // chatter around the JSON is dropped.
    Regex regex = new Regex(@"{(.|\s)*}", RegexOptions.Multiline);
    Match match = regex.Match(chatGPTResponse);
    if (match.Success)
    {
        string jsonString = match.Groups[0].Value;
        System.Diagnostics.Debug.WriteLine("jsonString: " + jsonString);
        var data = JsonConvert.DeserializeObject<MyJokes>(jsonString);
        return data?.Jokes;
    }
    else
    {
        // No JSON found: print the raw response for inspection
        Console.WriteLine(chatGPTResponse);
    }
}
As you can see, I had to encourage ChatGPT after it suddenly stopped responding in JSON format. There were a few other strange responses and basically - it was not consistent at all. But in the end I got good responses most of the time.
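One way to cope with that kind of inconsistency is to validate each reply and simply ask again when it is unusable. The following is only a sketch under stated assumptions: `JokeRequester`, `ExtractJokesJson` and `AskUntilValidAsync` are hypothetical names, `askModel` stands in for whatever code sends the prompt, and `System.Text.Json` is used to keep the example free of external packages.

```csharp
using System;
using System.Text.Json;
using System.Threading.Tasks;

public static class JokeRequester
{
    // Cuts out the span from the first '{' to the last '}' and checks that it
    // parses as a JSON object with a "jokes" array. Returns null otherwise.
    public static string? ExtractJokesJson(string reply)
    {
        int start = reply.IndexOf('{');
        int end = reply.LastIndexOf('}');
        if (start < 0 || end <= start) return null;

        string candidate = reply.Substring(start, end - start + 1);
        try
        {
            using JsonDocument doc = JsonDocument.Parse(candidate);
            bool hasJokes = doc.RootElement.ValueKind == JsonValueKind.Object
                && doc.RootElement.TryGetProperty("jokes", out JsonElement jokes)
                && jokes.ValueKind == JsonValueKind.Array;
            return hasJokes ? candidate : null;
        }
        catch (JsonException)
        {
            return null;
        }
    }

    // Calls askModel up to maxAttempts times and returns the first usable
    // JSON payload, or null if every attempt came back unusable.
    public static async Task<string?> AskUntilValidAsync(Func<Task<string>> askModel, int maxAttempts = 3)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            string reply = await askModel();
            string? json = ExtractJokesJson(reply);
            if (json != null) return json;
        }
        return null;
    }
}
```

Capping the number of attempts matters here, since every retry is another paid API call.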
SQLite
After coding and testing with ChatGPT for a while, I realized that I needed a mechanism to handle duplicates. ChatGPT was not the best source of jokes, since it tended to recycle a specific set of jokes and only occasionally introduced a handful of new ones. So I implemented a small SQLite database that stored the jokes and checked for duplicates. I also decided to save the generated voicelines there, so that I would not ask for the same voiceline multiple times (when something went wrong). Here is a glimpse into the basic structure I used:
SQLiteCommand cmd;
cmd = conn.CreateCommand();
cmd.CommandText = "CREATE TABLE IF NOT EXISTS Jokes (Part1 VARCHAR(128), Part2 VARCHAR(128), Part3 VARCHAR(128), Type VARCHAR(32), Status INTEGER, PublishNumber INTEGER)";
cmd.ExecuteNonQuery();
cmd.CommandText = "CREATE UNIQUE INDEX IF NOT EXISTS IX_Joke ON Jokes (Part1, Part2, Part3)";
cmd.ExecuteNonQuery();
cmd.CommandText = "CREATE TABLE IF NOT EXISTS Voicelines (TextToRead VARCHAR(128), Voice VARCHAR(32), Filename VARCHAR(128), Duration REAL)";
cmd.ExecuteNonQuery();
cmd.CommandText = "CREATE UNIQUE INDEX IF NOT EXISTS IX_Voiceline ON Voicelines (TextToRead, Voice)";
cmd.ExecuteNonQuery();
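With the unique index IX_Joke in place, the duplicate check can be left to SQLite itself. The sketch below shows one possible way to do that with `INSERT OR IGNORE`; it is not necessarily how the project does it. The `JokeStore` class is hypothetical, the System.Data.SQLite NuGet package is assumed (based on the `SQLiteCommand` type above), and the initial values of 0 for Status and PublishNumber are placeholders.

```csharp
using System.Data.SQLite; // NuGet package System.Data.SQLite (assumed)

public static class JokeStore
{
    // Tries to store a joke. Returns true if a row was inserted and false if
    // the unique index on (Part1, Part2, Part3) rejected it as a duplicate.
    public static bool TryInsertJoke(SQLiteConnection conn, string part1, string part2, string part3, string type)
    {
        using SQLiteCommand cmd = conn.CreateCommand();
        cmd.CommandText =
            "INSERT OR IGNORE INTO Jokes (Part1, Part2, Part3, Type, Status, PublishNumber) " +
            "VALUES (@part1, @part2, @part3, @type, 0, 0)"; // 0, 0 are placeholder values
        cmd.Parameters.AddWithValue("@part1", part1);
        cmd.Parameters.AddWithValue("@part2", part2);
        cmd.Parameters.AddWithValue("@part3", part3);
        cmd.Parameters.AddWithValue("@type", type);

        // An ignored insert changes no rows, so the row count tells us
        // whether the joke was new.
        return cmd.ExecuteNonQuery() == 1;
    }
}
```

Note that this only catches exact matches; a joke that comes back slightly reworded would still be stored as new.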
Play.HT
Now that the joke from ChatGPT is stored in the local database, it's time for some voicelines. First I had to create an account and also request an API secret key. The Play.HT API was well documented, but did not contain any C# examples at the time. Anyway - it was not that difficult to write something that worked for my application:
internal class PayloadData
{
    [JsonProperty(PropertyName = "text")]
    internal string Text { get; set; }
    [JsonProperty(PropertyName = "voice")]
    internal string Voice { get; set; }
    internal PayloadData(string text, string voice)
    {
        Text = text;
        Voice = voice;
    }
}
...
internal async Task<bool> GenerateVoiceover(string text, string voice)
{
    PayloadData payloadData = new PayloadData(text, voice);
    string json = JsonConvert.SerializeObject(payloadData);
    var data = new StringContent(CleanText(json), Encoding.UTF8, "application/json");
    data.Headers.ContentType = new MediaTypeHeaderValue("application/json");
    using (var client = new HttpClient())
    {
        client.DefaultRequestHeaders.Add("AUTHORIZATION", $"Bearer {_apiSecretKey}");
        client.DefaultRequestHeaders.Add("X-USER-ID", $"{_apiUsername}");
        client.DefaultRequestHeaders.Add("accept", "text/event-stream");
        // Await the request instead of blocking the thread with .Result
        var response = await client.PostAsync(_apiEndpoint, data);
        var stream = await response.Content.ReadAsStreamAsync();
        using (var reader = new StreamReader(stream))
        {
            while (!reader.EndOfStream)
            {
                var line = reader.ReadLine();
                if (!String.IsNullOrEmpty(line))
                {
                    // Progress events: print a dot and skip the line.
                    // This check has to come first, since these lines contain no '{'.
                    if (line.Equals("event: generating"))
                    {
                        Console.Write(".");
                        line = "";
                    }
                    // Ignore all other lines without a json payload
                    else if (!line.Contains('{'))
                    {
                        line = "";
                    }
                    // Keep only the json response
                    // Example line: data: {"id":"1n2AuS0Acz7iMmkVcH","progress":1,"stage":"complete","url":"https://peregrine-results.s3.amazonaws.com/pigeon/1n2AuS0Acz7iMmkVcH_0.mp3","duration":2.368,"size":49005}
                    if (line.StartsWith("data: {"))
                    {
                        line = line.Replace("data: {", "{");
                    }
                }
                // Deserialize server sent event and handle it
                if (!String.IsNullOrEmpty(line))
                {
                    ApiEvent? apiEvent = JsonConvert.DeserializeObject<ApiEvent>(line);
                    if (apiEvent != null)
                    {
                        if (!String.IsNullOrEmpty(apiEvent.ErrorMessage))
                        {
                            Console.WriteLine("Error: " + apiEvent.ErrorMessage);
                            return false;
                        }
                        else
                        {
                            // Example line: {"id":"1n2AuS0Acz7iMmkVcH","progress":1,"stage":"complete","url":"https://peregrine-results.s3.amazonaws.com/pigeon/1n2AuS0Acz7iMmkVcH_0.mp3","duration":2.368,"size":49005}
                            // Null-safe comparison: not every event carries a stage
                            if (apiEvent.Stage == "complete")
                            {
                                Console.Write(Environment.NewLine);
                                Console.WriteLine("Downloading...");
                                string downloadUrl = !String.IsNullOrEmpty(apiEvent.Url) ? apiEvent.Url : "";
                                // Path.Combine avoids hand-written (and unescaped) path separators
                                string localFilePath = Path.Combine(_outputDirectory, $"{apiEvent.Id}.mp3");
                                float duration = apiEvent.Duration;
                                if (String.IsNullOrEmpty(downloadUrl))
                                {
                                    Console.WriteLine("downloadUrl is null or empty, download aint gonna happen...");
                                    return false;
                                }
                                using (var downloadClient = new HttpClient())
                                {
                                    using (var s = await downloadClient.GetStreamAsync(downloadUrl))
                                    {
                                        // FileMode.Create truncates an existing file, so no stale bytes are left behind
                                        using (var fs = new FileStream(localFilePath, FileMode.Create))
                                        {
                                            await s.CopyToAsync(fs);
                                        }
                                    }
                                }
                                _repository?.InsertVoiceline(text, voice, localFilePath, duration);
                            }
                        }
                    }
                }
            }
        }
    }
    return true;
}
private string CleanText(string input)
{
    // Assumption: the character being replaced is the typographic apostrophe (U+2019);
    // as listed, the call replaced a plain apostrophe with itself and did nothing.
    string output = input.Replace("\u2019", "'");
    return output;
}
Again - this is not the full code, just the important bits. I'm not going to write every single line of code in this article.
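The line filtering inside the read loop above can also be pulled out into a small helper, which makes it easier to test on its own. This is a minimal sketch, not part of the original project: `SseLine` and `TryGetJsonPayload` are hypothetical names, and the only thing assumed about the stream is what the example lines above show, namely that payload lines start with `data:` followed by a JSON object.

```csharp
using System;

public static class SseLine
{
    // Returns the JSON payload of a "data: {...}" line, or null for event
    // names, blank lines, and anything else that carries no JSON object.
    public static string? TryGetJsonPayload(string? line)
    {
        if (string.IsNullOrWhiteSpace(line)) return null;

        const string prefix = "data:";
        if (!line.StartsWith(prefix, StringComparison.Ordinal)) return null;

        // Strip the prefix and surrounding whitespace, then make sure what
        // is left looks like a JSON object.
        string payload = line.Substring(prefix.Length).Trim();
        return payload.StartsWith("{", StringComparison.Ordinal) ? payload : null;
    }
}
```

The read loop would then only need to call the helper and deserialize whenever the result is not null.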
Custom video background
I used the NuGet packages FFmpegArgs.Executes and FFMpegCore to generate two different random animated backgrounds (starfield and growing circles). I first wanted to use actual videos that I had filmed out in nature somewhere, but later I decided to use generated videos instead. Honestly - there is just too much stuff going on in there to write about it here, but I might write a separate blog post about it some day.
Merging audio
I used FFmpeg to merge the audio files with the backgrounds like this:
private void MergeAudio()
{
    System.Diagnostics.Debug.WriteLine($"VideoGenerator.MergeAudio()");
    Console.WriteLine("Merge video and audio...");
    string audioDirectory = @ConfigurationManager.AppSettings["PlayHT_OutputDirectory"] ?? String.Empty;
    string videoDirectory = @ConfigurationManager.AppSettings["OutputPath"] ?? String.Empty;
    int counter = 0;

    // Add silent audio
    var render = FFmpegRender.FromArguments(
        $"-f lavfi -i anullsrc=channel_layout=stereo:sample_rate=44100 -i {_outputFile} -c:v copy -c:a aac -shortest -y temp.mp4",
        new FFmpegRenderConfig()
            // .WithFFmpegBinaryPath("path to ffmpeg") // Default from PATH or current dir
            .WithWorkingDirectory(System.IO.Path.GetDirectoryName(Assembly.GetEntryAssembly()?.Location))
    );
    var result = render.Execute();
    if (result.ExitCode != 0)
    {
        throw new Exception($"ERROR: {String.Join(",", result.ErrorDatas)}");
    }

    // string inputVideoFile = videoDirectory + $"\ai-short-{_joke?.CurrentMaxPublishNumber + 1}.mp4";
    // Input 0 is the video with silent audio, inputs 1..N are the voicelines
    string args = $"-i temp.mp4 ";
    foreach (Voiceline v in _voicelines)
    {
        args += $"-i {v.Filename} ";
    }

    // Delay each voiceline so that it starts when its text appears
    args += "-filter_complex \"";
    foreach (VideoScriptItem vsi in _videoScript._videoScriptItems)
    {
        if (vsi.VideoScriptType == VideoScriptItem.ItemType.WriteText)
        {
            counter++;
            int theDelayInMs = (int)(((float)vsi.StartFrame / (float)_videoFps) * 1000) + (_voicelinePaddingFrames * 25 * 2);
            args += $"[{counter}]adelay={theDelayInMs}[aud{counter}]; ";
        }
    }

    // Mix the delayed voicelines and merge them with the silent track
    counter = 0;
    foreach (Voiceline v in _voicelines)
    {
        counter++;
        args += $"[aud{counter}]";
    }
    args += $"amix={counter}:normalize=false,apad[a]; ";
    args += "[0:a][a]amerge[a]";
    args += "\" ";
    args += $"-map 0:v -map \"[a]\" -codec:v copy -codec:a aac -ac 2 -y {_outputFile}";

    var render2 = FFmpegRender.FromArguments(
        args,
        new FFmpegRenderConfig()
            // .WithFFmpegBinaryPath("path to ffmpeg") // Default from PATH or current dir
            .WithWorkingDirectory(System.IO.Path.GetDirectoryName(Assembly.GetEntryAssembly()?.Location))
    );
    var result2 = render2.Execute();
    if (result2.ExitCode != 0)
    {
        throw new Exception($"ERROR: {String.Join(",", result2.ErrorDatas)}");
    }
}
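The core of MergeAudio is the filter string: each voiceline is delayed to its start time, and the delayed streams are then mixed into one. As a rough illustration of that idea in isolation, here is a simplified sketch that builds such a string from a list of delays. `AudioFilterBuilder` and the `[mix]` output label are hypothetical, the padding term from the code above is left out, and input 0 is assumed to be the video with inputs 1..N being the voicelines.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class AudioFilterBuilder
{
    // Converts a frame index to a delay in milliseconds at the given frame rate.
    public static int FrameToMilliseconds(int startFrame, int fps)
    {
        if (fps <= 0) throw new ArgumentOutOfRangeException(nameof(fps));
        return (int)Math.Round(startFrame * 1000.0 / fps);
    }

    // Builds an FFmpeg filter graph that delays input i+1 by delaysMs[i]
    // and mixes all delayed streams into a single stream labeled [mix].
    public static string Build(IReadOnlyList<int> delaysMs)
    {
        if (delaysMs.Count == 0)
            throw new ArgumentException("At least one voiceline is required.", nameof(delaysMs));

        var sb = new StringBuilder();
        for (int i = 0; i < delaysMs.Count; i++)
        {
            int input = i + 1; // input 0 is the video
            sb.Append($"[{input}]adelay={delaysMs[i]}[aud{input}]; ");
        }
        for (int i = 1; i <= delaysMs.Count; i++)
        {
            sb.Append($"[aud{i}]");
        }
        sb.Append($"amix=inputs={delaysMs.Count}:normalize=false,apad[mix]");
        return sb.ToString();
    }
}
```

Keeping the string building separate from the FFmpeg call makes it possible to check the generated filter without rendering a video.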
Upload to YouTube
Finally I got to the part where I just needed to upload to YouTube. The concept was actually pretty simple, and Google describes how it is done in detail, although it did involve quite a bit of code. One thing I found out was that a free account has a limit on how many uploads it can do in a day. Basically - 6 uploads is the limit. And that includes failed attempts - simply calling the API endpoint costs a number of tokens. After 6 attempts you are out of tokens.
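Since failed attempts count too, it can help to track the budget locally before calling the API at all. The sketch below is only an illustration: the `UploadQuota` class is hypothetical, and the default numbers (10,000 units per day and 1,600 units per upload) are assumptions based on Google's published quota costs at the time, which happen to match the 6 uploads observed here (10,000 / 1,600 = 6.25). They may have changed since.

```csharp
using System;

public sealed class UploadQuota
{
    private readonly int _costPerUpload;
    private int _remaining;

    // Default values are assumptions, see the text above.
    public UploadQuota(int dailyUnits = 10_000, int costPerUpload = 1_600)
    {
        if (dailyUnits < 0) throw new ArgumentOutOfRangeException(nameof(dailyUnits));
        if (costPerUpload <= 0) throw new ArgumentOutOfRangeException(nameof(costPerUpload));
        _remaining = dailyUnits;
        _costPerUpload = costPerUpload;
    }

    // Number of upload attempts the remaining budget still covers.
    public int UploadsLeft => _remaining / _costPerUpload;

    // Reserves the cost of one attempt. Call this before every upload call,
    // because a failed attempt is charged just like a successful one.
    public bool TryReserveUpload()
    {
        if (_remaining < _costPerUpload) return false;
        _remaining -= _costPerUpload;
        return true;
    }
}
```

A program that uploads once a day never gets near the limit, but a debugging session with repeated failures can burn through it quickly.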
End result
I managed to create a few (180-ish) videos and overall it was a lot of fun figuring the whole thing out. I have a playlist on my YouTube channel with all the jokes. I also made a video describing the process and pushed it to my channel.
While I did have fun figuring all this out, I'm done after my 180 videos. ChatGPT was not a great source of jokes: it returned a lot of duplicates and later started to modify old jokes instead of supplying me with new ones. There were so many duplicates that I had to build a full loop into the program in order to handle them, and eventually it got boring. ChatGPT also basically made up "new" jokes by explaining the punchline (technically not the same joke, I guess), and let's just say that ChatGPT does not have a sense of humor.
In the end - I had fun, but that's it. No more AI dad-jokes from me.