Handling JSON files with Hive is not always an easy task.
If you need to extract some specific fields from a structured JSON, we have some alternatives:
There are two UDF functions that are usually helpful on this cases: ‘get_json_object’ and ‘json_tuple’. These functions allows you to access json fields from Hive without installing additional libraries.
To navigate JSON structure with get_json_object, the entire JSON file has to be mapped as string.
1) Create the external table as string:
CREATE EXTERNAL TABLE json_table (str string) LOCATION 's3://mybucket/input/jsonserde' ;
2) select field from store.fruit field:
select get_json_object(json_table.str, '$.store.fruit\') as MyField from json_table;