In Hive, you cannot always use IN and NOT IN the same way you would in PostgreSQL or MySQL. Similar logic is usually implemented with joins instead.
Implementing IN
To replace an IN clause, you can use LEFT SEMI JOIN.
Original query:
1 | SELECT a.key, a.value |
Hive version:
1 | SELECT a.key, a.val |
Implementing NOT IN
Original query:
1 | SELECT a.key, a.value |
In Hive, this can be rewritten with a LEFT OUTER JOIN plus a NULL check:
1 | SELECT t1.key |