I am sorry, I don't understand exactly what you need. Do you want to find a regular expression that can be used to check whether data is encrypted or not?
If I understood your question correctly, I don't think you will be able to run a regular expression search on an encrypted document, because the applications you would use for that still need the document in its original, unencrypted form. Imagine how disastrous it would be if any application you could lay your hands on were able to access your encrypted document and run regular expression searches on it. It would defeat the whole purpose of encrypting the document in the first place.
Maybe you could elaborate more on what you want to achieve.
My question is about the following: a client outsources file storage to untrusted clouds, encrypting the files first. Authorized third parties should be able to evaluate a regular expression on an encrypted file stored at the server (the cloud) without decrypting it on the server side. If any words derived from the regular expression match in the encrypted document, they should be returned to the authorized client or third party. In other words, an authorized person searches (evaluates) on the server side for words derived from a regular expression in encrypted documents (of course these words should also be encrypted if possible).
Example: for ab*, all the words in the document starting with 'ab' are matching words.
Please see the examples quoted in the given link.
I know what a regular expression is because I taught this subject in computation theory and compiler techniques.
In fact, when an encryption system is designed, it must take into consideration how difficult it is for a third party to find an equivalent encryption key. Also, the key generators must have high complexity. So I think that for a good cryptosystem it is impossible to find a regular expression. You may be able to find a regular expression for weak and simple systems.
I think in your case you will need an in-house solution that performs regular expression searches by first decrypting the encrypted files and then running the search. I foresee a lot of overhead, especially if operations on the files on the server side are frequent and there are many files to decrypt.
Would you have one encryption key for all the files in the cloud, or will each file have its own key? Where do you plan to keep the decryption keys for the encrypted files, in the cloud?
I don't know whether any such solutions exist, but if I were to come up with an in-house solution to achieve that objective, I would set up port forwarding, e.g. from the cloud server to a physical server over which I have much more physical control, so that I can put elaborate security measures in place. The cloud server would forward all file requests to the physical server, where I can decrypt the encrypted files before searching for the words, phrases, or expressions requested by users on the client side. I would then return the files that match the requested search expression.
I caution, however, that this will incur a lot of overhead during the decryption and search operations on your server side, and the handling of the encryption and decryption keys may also be exploited by adversaries if it is not done well.
It can be done, though it needs a deeper understanding of your business case. Let's wait and see what answers other experts in this domain have to offer.
In the meantime, you could check out this link: https://www.gnupg.org/gph/en/manual/x110.html
With homomorphic encryption you may achieve something similar, but in general regular expression matching is too powerful an instrument: it is hard to design a cryptosystem that permits it on encrypted data without fully destroying the security. In fact, with multiple REs you may recover the whole cleartext. If you require a key to perform an RE search anyway, then I think it is easier and better to decrypt and then search.
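To illustrate the point about recovering the cleartext with multiple REs, here is a minimal sketch. It assumes an idealized scheme that reveals only a yes/no answer to "does the hidden document match this regular expression?". The names `make_oracle` and `recover` and the alphabet are made up for the illustration and do not come from any real system.

```python
import re
import string


def make_oracle(secret: str):
    """Return a function that only answers: does the hidden text match this regex?"""
    def oracle(pattern: str) -> bool:
        return re.search(pattern, secret) is not None
    return oracle


def recover(oracle, alphabet: str = string.ascii_lowercase + " ") -> str:
    """Rebuild the hidden text one character at a time from yes/no answers."""
    known = ""
    # Stop once the anchored pattern for the whole recovered text matches.
    while not oracle("^" + re.escape(known) + "$"):
        for ch in alphabet:
            # Ask whether the text starts with the known prefix plus one more character.
            if oracle("^" + re.escape(known + ch)):
                known += ch
                break
        else:
            raise ValueError("character outside the assumed alphabet")
    return known


oracle = make_oracle("attack at dawn")
print(recover(oracle))  # attack at dawn
```

Each query leaks only one bit, yet a few queries per character are enough to reconstruct everything, which is why unrestricted RE evaluation is so hard to reconcile with confidentiality.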
On the other hand, if you limit the kind of RE you want to evaluate, it is more feasible. Even better is search via keywords; for this there are practical schemes based on SHE, although I understand that this is not exactly what you were looking for.
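To give a rough idea of what keyword search over outsourced data can look like, here is a simplified sketch. Note that it is not one of the SHE-based schemes mentioned above: it swaps in a plain keyed-hash (HMAC) index, and it assumes the documents themselves are encrypted separately with a proper cipher. The function names and the demo key are hypothetical.

```python
import hashlib
import hmac
import re


def token(key: bytes, word: str) -> str:
    """Keyed hash of one keyword; without the key the server cannot compute it."""
    return hmac.new(key, word.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def build_index(key: bytes, docs: dict[str, str]) -> dict[str, set[str]]:
    """Client side: map each keyword token to the ids of the documents containing it."""
    index: dict[str, set[str]] = {}
    for doc_id, text in docs.items():
        for word in set(re.findall(r"\w+", text.lower())):
            index.setdefault(token(key, word), set()).add(doc_id)
    return index


def server_search(index: dict[str, set[str]], search_token: str) -> set[str]:
    """Server side: look up an opaque token; the server never sees the keyword."""
    return index.get(search_token, set())


key = b"hypothetical demo key"
docs = {"d1": "abstract algebra notes", "d2": "about the abacus"}
index = build_index(key, docs)  # uploaded to the cloud next to the encrypted files
print(sorted(server_search(index, token(key, "abacus"))))  # ['d2']
```

The server still learns which tokens are queried and which documents they hit, so this kind of index trades some leakage for speed, and it only supports exact keywords, not general REs.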
"By definition" it is not possible in the general sense, but you could try some "approximation". I would try two approaches (the first requires low computational power; the second is more flexible, but could require more operations than full decryption!):
first: using symmetric encryption with XOR (like RC4 in WEP), you could easily adapt your search pattern to each piece of data
second: use some kind of homomorphic encryption (or even hashes on short pieces of data, similar to signatures on file systems in Android, but on shorter strings)
The second solution (especially with hashes) can leak some data, but with some kind of deniability (every expression could cover a big data set, mostly random strings).
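The first approach above (XOR-based encryption with the pattern adapted to each position) could be sketched roughly as follows. This is only an illustration under assumptions: the keystream is a toy built from SHA-256 with a counter, standing in for RC4 or any real stream cipher, only literal patterns are handled, and all function names are made up.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 over key + counter); a stand-in for a real stream cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """XOR with the keystream; applying it twice decrypts."""
    return xor(plaintext, keystream(key, len(plaintext)))


def adapted_patterns(key: bytes, pattern: bytes, text_len: int) -> list[bytes]:
    """Key holder: encrypt the literal pattern once for every possible offset."""
    ks = keystream(key, text_len)
    return [xor(pattern, ks[i:i + len(pattern)])
            for i in range(text_len - len(pattern) + 1)]


def server_find(ciphertext: bytes, patterns: list[bytes]) -> list[int]:
    """Server: compare each ciphertext window with the pattern adapted to that offset."""
    return [i for i, p in enumerate(patterns) if ciphertext[i:i + len(p)] == p]


key = b"hypothetical demo key"
ct = encrypt(key, b"abacus and abbey")
print(server_find(ct, adapted_patterns(key, b"ab", len(ct))))  # [0, 11]
```

The server never needs the key, but anyone who knows or guesses the pattern can XOR it out of the adapted copies and learn the keystream at those positions, so this is an approximation in the sense described above, not a secure scheme.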