First picture @ https://forums.mydigitallife.net/posts/1898322 - it kinda looks like a Christmas card. First picture @ https://forums.mydigitallife.net/posts/1877025 - in my mind it looks more realistic, I could buy that one without a second thought, except for that stupid angle the dogs are looking at. I'm looking at the pictures with my mind set to "these are real life pictures"... I love what you're doing, Yen. Very cool.
Thank you. In my last post I used old prompts to compare the AIs... Now the focus is on "these are real life pictures", to show what's possible. The cool thing is... if you don't like, for instance, where the dogs are looking, you can just prompt for it. For instance, add: "The dogs are paying attention to the photographer" or "The dogs are looking at..." https://imgur.com/a/gBJzQ8j
Last picture: holding a cup of coffee with two hands while reaching out with a third hand... And from the second picture: funny angles etc.
The longer you watch, the more cursed some of them get. Check what the mirror columns reflect exactly in that wedding picture, check the ornaments, the chairs. Check the bench in the park one (and many other details in the background not making sense at all). Check the left upper part of the snow picture for a really strange hallucination. Etc. These are just cursed images to me.
I posted the images exactly as they came out of the first run, with all their imperfections. No cherry-picking, to show the real capabilities of the AI and what one can expect. The yield of usable images is great... the problem is rather spotting the imperfections. If you spot one, you simply re-create the image using a different seed... and there are tons of settings to optimize. That's the advantage of running AI at home. TBH I often spot mistakes the day after; a year ago mistakes were much more obvious. If one uses AI output for productive work, one has to take a really close final look... When we look at such images here, we know in advance that they were AI created... that normally isn't the case when somebody shows you some pictures. That's a point, too. Another run, different seed: Another run, same seed, different scheduler and sampler: For Carlos:
Hi, during Christmas I made a short AI movie story. It's been a lot of fun. Maybe you'll like it. For a better experience / quality (in case the Google Drive player distorts it and makes playback choppy) I suggest downloading it and watching it locally, speakers on. https://drive.google.com/file/d/19DV1tAR1rWMsz0jetYOlN8FRRpi1IDUb/view?usp=sharing Cya soon again. Yen
Hi old friend. I was really glad to see you around at MDL! We've known each other for a long time. Thank you! Those are for you. I hope the AI got your flag right, since it's a rather complex flag. https://imgur.com/a/aLEP9ZI Cya. Yen.
Well, it's been my hobby for a year now. The AI development is insane. Almost every week new stuff comes out. It's hard to find a balance between learning new stuff, developing workflows and finally being creative... but it's so much fun. There is a new model which caught my attention and which I want to present here. Another lip sync model. The last one I posted here was Natalie Portman speaking... now I took another actress. Version 1: Full head, mood: happy, euphoric, informative: https://drive.google.com/file/d/19omZk-bJgJHmTKxEGOW3wxQgVAEeHTcR/view?usp=drive_link Version 2: Behind a lectern. Mood: Rather shy, reserved... https://drive.google.com/file/d/1iQmPJYd9bC9WaayhgnbzoLWB6rbi5K_7/view?usp=drive_link As always, downloading it and playing it locally avoids the choppy and distorted Drive player. I think it's a noticeable improvement, still not absolutely perfect, but it's getting closer. For only 480p resolution it's quite good. It can also do 720p, but that still takes more than 1 hour to render... Cya soon. In February I'll take a break. I'll be on a trip in S/E Asia again. Have a great time. Yen.
That was close (and close)... but a lotta throat movements... At around 00:14-ish, why did it decide on that awkward moment of silence, I wonder? That would not be typical of the real person. She has never been that "stupidly out of words". The Natalie Portman one was imo better, but it was smaller ofc, so I might be biased here. Very cool to see indeed, amazing how easy it is to be fooled. Have a safe trip, Yen. Happy New Year.
Happy New Year! Thanks for the comments, always useful for improving how to deal with AI. Let me add some details so you can tell what is due to 'me' and what is due to the AI (its weaknesses).
The videos were actually created using 2 main AIs. The first one is chatterbox, which I already used for the Natalie video. This AI can mimic a voice from a given real audio sample plus some text of your own. So there I write what the person should say. There are a few options for how it is said (mood) and also for where a pause goes. The second is the lip sync AI, which is the new one here.
My intention was to have 2 different styles: a confident, convincing speech and a more shy appearance, like stage fright. Above all I wanted to challenge the AI, because I know... when there was a pause, previous lip sync AIs made the character look like it had suddenly died. So I inserted the pause you mentioned on purpose, to see what the new AI would do, regardless of whether the real Angelina would pause like that or not. After I had my 2 different audio files, made from an original Angelina reference, I looked for some Angelina images I could use to test the AI. I chose an extreme close-up and a classic situation behind a lectern, like a politician.
So the second (new) AI now has to animate a single still image into a 'talking avatar'. It has to figure out how speech usually translates into human facial expressions so that it appears natural.
Finally my criticisms. The AI handled that pause better than any so far, no dead character. It rendered it like a nervous stage-fright dropout, or like somebody who is not really self-confident at that moment. The lip sync still has some flaws. There is simple physics in speech that one cannot deny. For instance, try to pronounce something like an 'm' without closing your lips... it is actually impossible. Or an 'l' without your tongue at the palate. Or even simpler, an 'a' with closed lips...
When I take a close look there are at least 3 flaws in each video... impressive anyway.
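The lip physics mentioned above could in principle be checked automatically. Here is a minimal sketch, not anything the lip sync AI itself does: it assumes you somehow already have, for every video frame, the phoneme being spoken and a lip opening value between 0.0 (closed) and 1.0 (wide open). The data, the function name `find_lip_flaws`, and both thresholds are made up for illustration; 'b' and 'p' are added next to 'm' because they also need closed lips.

```python
# Hypothetical input: one (phoneme, lip_opening) pair per video frame.
# lip_opening is 0.0 for fully closed lips and 1.0 for wide open.
# Phoneme sets and thresholds are illustrative assumptions, not measured values.

CLOSED_LIP_SOUNDS = {"m", "b", "p"}  # sounds that need the lips pressed together
OPEN_LIP_SOUNDS = {"a"}              # sounds that need the lips clearly apart

CLOSED_MAX = 0.1  # assumed: above this, the lips do not count as closed
OPEN_MIN = 0.3    # assumed: below this, the lips do not count as open


def find_lip_flaws(frames):
    """Return (frame_index, phoneme, lip_opening) for each implausible frame."""
    flaws = []
    for index, (phoneme, lip_opening) in enumerate(frames):
        if phoneme in CLOSED_LIP_SOUNDS and lip_opening > CLOSED_MAX:
            # e.g. an 'm' spoken with open lips
            flaws.append((index, phoneme, lip_opening))
        elif phoneme in OPEN_LIP_SOUNDS and lip_opening < OPEN_MIN:
            # e.g. an 'a' spoken with closed lips
            flaws.append((index, phoneme, lip_opening))
    return flaws


# Made-up example: frames 2 and 3 break the rules, the rest are fine.
frames = [("m", 0.02), ("a", 0.60), ("m", 0.40), ("a", 0.05), ("l", 0.30)]
print(find_lip_flaws(frames))
```

The 'l' case (tongue at the palate) is left out on purpose, since the tongue position cannot be read from a lip opening value alone.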
Well, that turned out to be near impossible... never thought about that. It was fun to try. Cool, that explains it. Since it was intentional, the AI handled that pretty okay I guess. It was only the pause I didn't associate with her; if it wasn't for that, I could absolutely buy that performance (and whatever message/propaganda is embedded, within reasonable terms for the person ofc) without a second thought. In my mind, if it's not very obvious initially, I don't think I would think too much about it, and I doubt people in general would scrutinize every detail for flaws, with all the media content we are bombarded with every day. It is impressive indeed. Blessing or curse? I think it's cool. Opposite but complementary forces. Exercising your own mind and forming your own opinion goes a long way.
Happy Easter! A new AI to play with. I just noticed it when I came back from my trip. It's massive and can do audio as well. Full HD is no issue anymore. (More to come in the next weeks...) https://drive.google.com/file/d/1UPopg10wtc7uymaPv3_gwbstkBZvuWvX/view?usp=drive_link Cya, Yen (As always... download it if you want the full quality, as the player there is bad...)
This new AI is a huge step forward... https://drive.google.com/file/d/1lS0b-u3XtuBlt5mQ4ztze-rsQSO4NsAT/view?usp=drive_link
Hi. This is my biggest AI project so far. I always wanted to write my own music. Thanks to AI, I have now been able to. First I wrote my lyrics:
[intro]
(Vocal whispers, layered with delay)
Frankfurt... The Nineties... Omen...
[verse]
Back in the nineties' electric haze
They called me Yen in those golden days
I walk these silent streets in the rain
Trying to feel that fire again
[chorus]
Omen, my cathedral built on sweat and sound
That was my religion, my sacred ground
Every night was a lifetime I couldn't hold
Now the story's over, and the memory's cold
[verse]
I see the faces, ghosts in the strobe
Lost in the rhythm, a spinning globe
We danced like the future would never break
A promise the daylight would always take
[chorus]
Omen, my cathedral built on sweat and sound
That was my religion, my sacred ground
Every night was a lifetime I couldn't hold
Now the story's over, and the memory's cold
[outro]
(The beat fades, leaving the string section to hold a final, sorrowful chord before vanishing)
They called me Yen... Just a memory now... So cold...
Omen was a club in Frankfurt in the 90s where we partied a lot. So I arranged some instruments and created an audio track using Ace Step AI. But then I thought I could create an entire music video from that. I used the z-image turbo, Qwen Image edit and Flux2 klein AIs to create the singer, the people and their outfits, and the club. (Not the original one, but another live club with a stage.) After that I used the great LTXV 2.3 AI to bring them to life. This video shows what is possible with open source AIs; I used those 5 open source AIs to make it. It was a lot of fun. It also illustrates that open source AI video making is advanced, whereas the sound still needs more time to develop. But it only costs energy for the GPU lol, the models are completely free. By creating that video I could bring back some old memories from when I was young. I am happy to share...
As always, download it to get the full quality. It is a 3-minute, fully AI-made video. https://drive.google.com/file/d/18QJd1Gw8psZIm24FqiBh27VB4kYSOLtF/view?usp=sharing Cya again. Yen
What first caught my eye: starting from ~00:42 (he's frozen in time, on the left side) and 01:27 (they are literally in sync). Lovely song, nice tune, I love it.