In Oracle, we can measure text similarity with Jaro-Winkler like this:
SELECT UTL_MATCH.JARO_WINKLER_SIMILARITY('STACKEXCHANGE', 'STAMPEXCHANGE') MYSTRING FROM DUAL; --98
It turns out that Teradata has Jaro-Winkler too, as explained here. Unfortunately, I don’t understand the documentation and the example there.
So far, the closest I can get in Teradata is EDITDISTANCE:
SELECT EDITDISTANCE('STACKEXCHANGE', 'STAMPEXCHANGE') MYSTRING; --2
So, how can I measure text similarity with Jaro-Winkler in Teradata? Could anyone please give me a simple example?
How to solve :
Check your release first: 16.20.24.01 is Feature Update 1 (FU1), while Feature Update 2 (FU2) is 16.20.40.01 and later.
This function is not a scalar function; it uses Table Operator syntax for set processing. It takes some getting used to, but these operators are very powerful.
SELECT *
FROM StringSimilarity (
  ON (
    SELECT 1 AS id, 'STACKEXCHANGE' AS a, 'STAMPEXCHANGE' AS b
    -- FROM ...
  ) PARTITION BY ANY
  USING
  ComparisonColumnPairs ('jaro_winkler(a,b) AS jw_dist')
  Accumulate ('id')
) AS dt;
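To run the same comparison over real rows rather than literals, the inner SELECT can read from a table. The following is only a sketch: the table name_pairs and its columns id, name_a and name_b are hypothetical placeholders, and it assumes the operator returns a similarity between 0 and 1, so the value is multiplied by 100 to bring it onto roughly the same 0 to 100 scale as Oracle's JARO_WINKLER_SIMILARITY. The two products may still disagree slightly in rounding.

```sql
-- Sketch only: name_pairs, id, name_a and name_b are hypothetical names.
SELECT id,
       name_a,
       name_b,
       jw_dist * 100 AS jw_pct   -- assumes jw_dist is in the range 0..1
FROM StringSimilarity (
  ON (
    SELECT id, name_a, name_b
    FROM name_pairs
  ) PARTITION BY ANY
  USING
  ComparisonColumnPairs ('jaro_winkler(name_a, name_b) AS jw_dist')
  Accumulate ('id', 'name_a', 'name_b')   -- carry input columns through to the output
) AS dt
ORDER BY jw_pct DESC;
```

Accumulate lists the input columns that should appear next to the computed score; without it, only the score column comes back and the rows cannot be tied to their source.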