This website uses Google Analytics and Advertising cookies, used to generate statistical data on how visitors uses this website (with IP anonymization) and to deliver personalized ads based on user prior visits to this website or other websites. [read more]
These cookies are disabled by default until your Accept/Reject Consent.
The Consent can be changed at any time by accesing the ePrivacy button located at the bottom of every page. Enjoy your visit!



How to anonymize strings and emails in PostgreS database

If we need to anonymize some sensible or personal data strings from our PostgreSQL database, then the bellow function will do the work for you. This postgres anonymize function is coded mainly to anonymize strings by replacing middle of the string with asterisks (*) and optionally keep the email domain if required.

String anonymize() examples

SELECT anonymize('email.address@example.com',1,1); returns "e***********************m"
SELECT anonymize('email.address@example.com',2,2); returns "em*********************om"
SELECT anonymize('email.address@example.com',1,1,true); returns "e***********s@example.com"

Clearly enough I hope, basically you need to provide the string that need to be anonymized, how many chars to keep from the string start, how many chars to keep from the string end and an optionally true value to inform the function that your string is an email address and you want to keep the top-level domain from that.

Of course, you can use the function for a display only purpose or to update database tables, eg UPDATE table_clients set email = anonymize(email,1,1,true) where id = 1; or UPDATE table_clients set details = replace('email.address@example.com',anonymize('email.address@example.com',1,1,true));

The FUNCTION anonymize()

CREATE OR REPLACE FUNCTION anonymize(
 i_str text, -- text to be anonymized
 str_start integer, -- no of chars to keep at start
 str_end integer, -- no of chars to keep at end
 str_keep_email_tld boolean default false -- keep email domain
)
RETURNS text AS
$BODY$
DECLARE str_len integer; str_anon text; str_array text[];
BEGIN
 IF str_keep_email_tld THEN
  str_array = string_to_array(i_str,'@');
  i_str = str_array[1];
  str_len = length(i_str);
    str_anon = concat_ws('',substr(i_str,1,str_start),repeat('*',str_len-str_end-str_start),substr(i_str,str_len-str_end+1, str_len),'@'||str_array[2]);
 ELSE 
  str_len = length(i_str);
    str_anon = concat_ws('',substr(i_str,1,str_start),repeat('*',str_len-str_end-str_start),substr(i_str,str_len-str_end+1, str_len));
 END IF;
 RETURN str_anon;
END
$BODY$
  LANGUAGE plpgsql IMMUTABLE
  COST 1;

Using the above Postres custom anonymize function against UNIQUE INDEX columns

Let's suppose you have a user's database table where you keep usernames, maybe in a form of an email address (used as username), and you need to anonymize one or multiple accounts. I assume your username column has a lowercase/uppercase unique index to prevent duplicates. If you run the anonymize function against one or multiple users there is a chance that the anonymized version of two usernames to be the same. So we need another anonymize function to be used for that. Let's name this new function anonymize_unique(). But I will let you to build this function, with some hints: use the table pk column (ID) and generate a md5 hash from the username id column. Or you can go further and keep the email domain from the username email, resulting something like that: v1) a-unique-32char-md5-hash or v2) a-unique-32char-md5-hash@example.com. What do you say, are you in?

That's all folks

label_outline

Share this post on your favorite networks.
© Copyright 2017 b247.eu.org | Just another information technology blog - All Rights Reserved