Well bucket better with mods

broken image
broken image
broken image

The hash_function depends on the type of the bucketing column.

broken image

(There’s a ‘0x7FFFFFFF in there too, but that’s not that important).

broken image

In general, the bucket number is determined by the expression hash_function(bucketing_column) mod num_buckets. How Hive Distribute the Rows Across the Buckets? | 2 | US | PASEO COSTA DEL SUR | 704 | PR | | zipcodes.recordnumber | untry | zipcodes.city | zipcodes.zipcode | zipcodes.state | If you are using Hive SELECT * FROM zipcodes WHERE state='PR' and zipcode=704 Loading/inserting data into the Bucketing table would be the same as inserting data into the table. To create a Hive table with bucketing, use CLUSTERED BY clause with the column name you wanted to bucket and the count of the buckets. Now let’s see the Advantages of Bucketing