Is AI Stealing My Art?

Started by parkdalegardener, January 14, 2023, 02:07:44 PM


parkdalegardener

In action the diffusion process looks like this. We start with random noise and denoise it one step, or iteration, at a time. I will start with "car" as a prompt. I give the AI a de-noising algorithm to work with, Euler Ancestral, and say go to work: refine the noise 20 times using Euler a. This would result in a picture of a car, of some type of quality, in some type of style, painted whatever.
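That loop can be sketched in a few lines. This is a toy, not the real sampler: `toy_denoiser` is an invented stand-in for the U-Net a real diffusion model uses, and the "image" is a single number. But the shape of the process is the same: denoise one step, re-inject a little fresh randomness (the "ancestral" part), repeat 20 times.

```python
import random

def toy_denoiser(x, target=1.0):
    # Stand-in for the real U-Net: estimates how far x is from the "car" target.
    return x - target

def euler_ancestral_sketch(steps=20, seed=0):
    """Toy ancestral sampler: start from pure noise, remove a fraction of
    the predicted noise each iteration, and add a small random kick."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, 1.0)                 # start from random noise
    for i in range(steps):
        noise_estimate = toy_denoiser(x)
        x = x - noise_estimate / (steps - i)  # one denoising step
        x += rng.gauss(0.0, 0.05)             # ancestral noise injection
    return x
```

With these made-up dynamics the sample converges near the "car" value of 1.0, just as the real sampler converges on an image that CLIP scores as matching the prompt.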
Illegitimus non carborundum
don't let the bastards grind you down

parkdalegardener

Let's put this together. With the current AI models you type a description of what you want to see in somewhat natural language: the prompt. The AI then uses CLIP to try to figure out just exactly what you want. The AI then de-noises the random noise picture it generates, on a pixel-by-pixel basis, as many times as you have told it to. The "magic" of the AI is that it tries to figure out how to resolve the noise into a coherent image, pixel by pixel, using CLIP to determine the outcome and make sense of the noise. The result changes each iteration depending upon the de-noising algorithm and how pleasant or unpleasant the current iteration is: CLIP's aesthetics score.

This is important. If we allowed this type of computer vision in our cars, we would be hitting Ladas because they are not considered as "good" or "pretty" a car as a Lincoln or a Porsche. A car learned by traditional computer vision is not subject to these kinds of interpretation. Diffusion models do not contain any representation of the training data in the form of a static image as we know it. They amalgamate all the information on what a car is: how pleasing a particular shape may be, a favourite colour or paint pattern, the number of windows. Whatever CLIP, and the humans scoring a training image, thought was important at the time of classification. That's what defines the outcome of a text-to-image request. That's what defines the image as it converges upon a solution to your request.

If a single pixel of every image that the LAION-5B base was trained upon, plus a single byte of ASCII text to describe it and a single byte to log its position in the database, were added all together, we would already be into the tens of gigabytes, many times the size of the model itself. The original model released to the public in August of this year (yep, that short a time ago) is about 4 GB. Yes sir. 4 GB. There is no way to store 5 billion images, either in whole or in part, in a 4-gigabyte file regardless of the compression method used. It's not about the images in the model, because there are no images in the models.
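A quick back-of-envelope check of those numbers (the per-image byte counts are the post's own assumptions: one RGB pixel, one caption byte, one index byte):

```python
# Even an absurdly minimal "one pixel per image" store dwarfs the checkpoint.
images = 5_850_000_000            # LAION-5B image/caption pairs
bytes_per_image = 3 + 1 + 1       # one RGB pixel + 1 caption byte + 1 index byte
naive_store = images * bytes_per_image   # bytes needed for this toy archive
model_size = 4 * 1024**3                 # a ~4 GB model checkpoint

print(naive_store / 1e9)     # ≈ 29.25 GB for one pixel per image
print(model_size / images)   # ≈ 0.73 bytes of model weight per training image
```

Even this minimal encoding needs several times the model's file size; the checkpoint averages well under one byte per training image, which is why no images can literally be stored inside it.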

Run your text-to-image program with no prompt. You will still get an image output. If you don't tell CLIP what you want, it just looks at the noise and desperately tries to figure out if there is a pattern, just like you do when you see random noise. It applies whatever de-noiser you chose for a single step (iteration) and tries again. It has no guidance other than the slightly resolved noise and an attempt by CLIP to resolve it into something "pretty" or "aesthetically pleasing." It will continue doing so till you say stop.
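As a cartoon of that unguided sampling: with no prompt there is nothing steering the result, yet repeated "denoising" still turns pure noise into some coherent output. The toy below uses simple neighbour-averaging in place of a learned de-noiser; everything about it is invented for illustration.

```python
import random

def unconditional_denoise(steps=200, size=32, seed=1):
    """With no prompt there is no text guidance: each pass just pulls
    neighbouring pixels toward each other, so pure noise still settles
    into *some* smooth pattern. The sampler always produces an image."""
    rng = random.Random(seed)
    row = [rng.gauss(0.0, 1.0) for _ in range(size)]  # 1-D "image" of noise
    for _ in range(steps):
        row = [(row[i - 1] + row[i] + row[(i + 1) % size]) / 3.0
               for i in range(size)]                   # local smoothing pass
    return row
```

After enough passes the jagged noise has resolved into a nearly flat pattern; a real model would instead resolve it toward whatever its training made it consider "pleasing."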

Your smartphone does this with the people-remover type feature on the camera. Some random person tries to "bunny ear" your selfie by throwing a peace sign behind you as they walk by. Your smartphone's AI de-noises the interloper right out of there. Crisis averted. The phone's AI never needed a photobomber-free image to remove the photobomber. It projects what it thinks is the correct background to replace the interloper. You decide how well it is doing.

parkdalegardener

That is where a lot of the misconception comes from. Diffusion models, the text-to-image generators, are trained differently. They start with an image of a car like CV models do, but they process it differently. The image is broken down into a pixel map and the training model is told that the image is a car. A noise pattern at the pixel level "diffuses" the image, like static on an old TV, if anyone here remembers that. The AI is told "that's a car," another round of noise is added to the image, the AI is told "that's a car" again, and another round of noise... Sooner or later all you have is random noise, but the AI is still learning "car" until you stop the whole process from looping forever.
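That training-side loop can be sketched as a toy. Here a short list of numbers stands in for the image, and the noise amount is a made-up constant; the point is the structure: at every noise level the pair shown to the model is still ("noisier image", "car").

```python
import random

def forward_diffusion(pixel_row, steps, noise_scale=0.3, seed=0):
    """Toy forward-diffusion schedule: keep adding noise to the same
    labelled image, recording the (image, label) pair at each step."""
    rng = random.Random(seed)
    schedule = []
    x = list(pixel_row)
    for _ in range(steps):
        x = [v + rng.gauss(0.0, noise_scale) for v in x]  # one round of noise
        schedule.append((x[:], "car"))  # still labelled "car" at this level
    return schedule
```

Run long enough, the final entries are indistinguishable from random noise, yet every one of them carried the label "car" during training.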

The images used for training have to be paired with descriptions of what is being trained on. Tags. We tag everything we post online, for the most part, and those are the training tags for the diffusion model. Say we post a pic on Farcebook of our new car, an ABC Electric Street Cruiser. If that particular picture was scraped for the diffusion model, it would also train on whatever other tags are on that image. If you never tagged the image as your new "car" then it may teach the diffusion model that your new car is actually some type of electric navy boat, with wheels, that travels in the city like a streetcar. AI is dumb.

This is where LAION comes in: the Large-scale Artificial Intelligence Open Network. They are the folks who scraped the net for image/tag pairs. In the case of the current LAION-5B dataset used to train Stable Diffusion: 5.85 billion. Yes, that's B as in billions of tagged image/text pairs.

The other part of text-to-image generation is the text part: CLIP, Contrastive Language-Image Pre-training. CLIP is simply a way to use more natural language to train the AI, and for people to use more conversational language to obtain a result from it. In addition to the tag, each training image also has an aesthetics score as rated by a person.
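A toy illustration of the shared embedding space CLIP learns. The vectors below are invented by hand; the real CLIP produces them with two trained neural encoders, one for text and one for images. But the scoring step works the same way: cosine similarity between a text embedding and an image embedding, with higher meaning "better match."

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hand-made stand-ins for CLIP's learned embeddings (purely illustrative).
text_embedding = {"car": [0.9, 0.1, 0.0], "tree": [0.0, 0.2, 0.9]}
image_embedding = [0.8, 0.2, 0.1]  # pretend encoding of a car photo

scores = {word: cosine(image_embedding, emb)
          for word, emb in text_embedding.items()}
```

Because the pretend car photo points in nearly the same direction as the "car" text vector, it scores far higher for "car" than for "tree"; that score is what steers the de-noising toward your prompt.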

You can freely look into the training images and the associated metadata to see if you were part of the training, and request removal if you feel any such information was obtained incorrectly for inclusion in the model.

parkdalegardener

So, did AI steal your artwork? Yes, no, and maybe. Is AI stealing your artwork? No and maybe. It has come to my attention that some here are of the impression that AI "prompted art" simply regurgitates the images, or parts of them, that the model was trained upon. This is an incorrect assumption. Diffusion doesn't work that way.

Without going into a crash course on machine learning, I'm going to try to help you understand how this stuff works, and a bit of the history. It starts with autonomous driving and driving assists in your car. Computer vision is done by training an AI on a whole lot of images of the same thing. They are cropped and scaled to a particular size for training, usually square. You tell the training what they are with a text description: this is a car. Give the AI a crapload of random car images and that text description. You tell the AI you are training that you are 100% sure the images are indeed cars. You flip them around a few times, maybe mirror them, possibly even rescale them. No matter what, they are still images of cars suitable for training.

In addition to training the AI on car images only, a group of other, non-car images is added into the mix. The text descriptions say what they are, which is not cars. You tell the AI that these images in the mix are not cars; you are 100% sure of that. These are "control" images. You flip, rotate, scale or whatever these images the same way as the car images. They are still not cars. The trained AI is then able to recognize a car when it sees one (hopefully) and knows that it isn't a tree or possibly a person.
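The positive/control setup can be caricatured in a few lines. Everything here is invented for illustration: real computer-vision training fits a deep network, not a brightness cutoff. But it shows the three ingredients named above: positives you are 100% sure of, negatives you are 100% sure of, and label-preserving augmentation.

```python
def augment(img):
    # Flips and mirrors keep the label: a reversed car is still a car.
    return [img, list(reversed(img))]

def train_threshold(cars, not_cars):
    """Toy 'trainer': pretend cars are bright on average and non-cars dark,
    and learn the brightness cutoff separating the two certain classes."""
    car_means = [sum(v) / len(v) for c in cars for v in augment(c)]
    other_means = [sum(v) / len(v) for c in not_cars for v in augment(c)]
    return (min(car_means) + max(other_means)) / 2.0

def is_car(img, threshold):
    return sum(img) / len(img) > threshold

cars = [[0.9, 0.8, 0.7], [0.8, 0.9, 0.6]]       # "100% sure these are cars"
not_cars = [[0.1, 0.2, 0.1], [0.2, 0.1, 0.3]]   # "100% sure these are not"
cutoff = train_threshold(cars, not_cars)
```

The learned cutoff then classifies new "images" it never saw during training, which is the whole point: the model keeps a decision rule, not the training pictures themselves.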

Where does one get these training images? After all, you need a huge number of car images to train with, and a large number of non-car images as well. Google is your friend, way more than you might think. A single line of Python or a Linux one-liner will give you all the images matching your criteria that you can ask for, straight from the Google API. In some respect a training image is in the dataset, though it has been heavily manipulated. If your image is in such a training dataset, it could be recovered with a lot of work.