Probably not, but yesterday the Royal Statistical Society hosted a talk by someone who is doing a PhD on Large Language Models such as ChatGPT. Apparently, the way it works is that it fits the coefficients of a Markov model, i.e. a model that predicts the next word in a sentence. For example, if the previous four words are "... the spade finesse was ...", it may predict that the next word is "marked" with probability 20%, "likely" 30%, "unlikely" 30%, "doomed" 10%, based on simple frequencies in the training database, much like the autocomplete in Google Search. You then generalise this in various ways so that it can predict the next word even in situations where the exact sequence is very rare or non-existent, but where "similar" (in some sense) sentences appear in the training database.

This was a bit of a déjà vu for me, since we toyed with such models in information theory classes when I was an undergrad in the 1980s. Those were simple statistical models, such as linear empirical Bayes models, with something like 100 parameters and about 10 k of training data. Miniature models like that can generate gibberish that looks like the desired language, but to generate meaningful, let alone useful, text you obviously need much larger models and far more training data. The newest version of GPT has about a trillion parameters and is trained on a petabyte (1E15 bytes) of data.

Generalising this, it can predict what answer someone would give to your question if you posted it on Reddit, which was originally the main training source. Reddit works quite well because it is largely Q&A and contains a huge diversity of writing styles. Later versions use other sources as well, but it is not entirely clear to me how a source like Wikipedia would be used to train a Q&A model.
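To make the frequency idea concrete, here is a minimal sketch of the kind of next-word predictor described above: it counts which words follow each context of up to four words in a training text, and backs off to a shorter context when the exact sequence has never been seen. The tiny corpus, the backoff scheme and the function names are all made up for illustration; real LLMs generalise with neural networks rather than this sort of simple backoff.

# Toy frequency-based next-word predictor with backoff (illustrative only).
from collections import Counter, defaultdict

def train(words, max_order=4):
    # Count how often each word follows every context of length 1..max_order.
    counts = defaultdict(Counter)
    for order in range(1, max_order + 1):
        for i in range(len(words) - order):
            context = tuple(words[i:i + order])
            counts[context][words[i + order]] += 1
    return counts

def predict(counts, context, max_order=4):
    # Return next-word probabilities; if the full context never occurred
    # in the training data, back off to progressively shorter contexts.
    for order in range(min(max_order, len(context)), 0, -1):
        key = tuple(context[-order:])
        if key in counts:
            total = sum(counts[key].values())
            return {w: c / total for w, c in counts[key].most_common()}
    return {}  # context entirely unseen, even as a single word

corpus = "the spade finesse was doomed but the heart finesse was marked".split()
model = train(corpus)
print(predict(model, "the spade finesse was".split()))   # exact context seen in training
print(predict(model, "the club finesse was".split()))    # unseen context, backs off to "finesse was"

The second call shows the generalisation problem in miniature: "the club finesse was" never appears in the toy corpus, so the model falls back on the shorter context "finesse was" and splits its prediction between "doomed" and "marked".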