hiveserver vs. hiveserver2 as seen from their RPC method lists (and, while we're at it, Presto)
On my end, I've been thinking it's about time to migrate to hiveserver2, and to do that I have to make shib support hiveserver2.
So before implementing anything, I figured I should look into what actually differs between hiveserver and hiveserver2, starting from the list of RPC API methods. Casual!
hiveserver
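The older one, hiveserver, is used by connecting over Thrift and calling its API. The number of *.thrift files is a little off-putting, but once you just go ahead and run the code generator, you get fairly readable code (in any language).
Listing the methods gives something like this.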
- execute(query)
- fetchOne()
- fetchN(numRows)
- fetchAll()
- getSchema()
- getThriftSchema()
- getClusterStatus()
- getQueryPlan()
- clean()
Extremely easy to understand. It goes without saying that query is a string and numRows is a number. The flow looks like this.
- Once connected, just call execute(query)
- If you want to know how the query will be executed, call getQueryPlan()
- To learn the types of the returned data, call getSchema()
- Fetch the results
- Repeat fetchOne() or fetchN(rows)
- There is no spec for when the results end
- Apparently a row with 0 columns arrives as the terminator
- Or call fetchAll()
- Repeat fetchOne() or fetchN(rows)
- Finally, call clean()
In other words, execute() comes first, and it is implicit that every other call operates on the query that was just executed. Hmm. Easy to understand, sure, but is that really OK?
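The flow above can be sketched roughly as follows. This is only an illustration under assumptions: `FakeHiveServerClient` is a hypothetical in-memory stand-in for the generated Thrift client (not a real library), rows are modeled as plain strings, and both "empty batch" and "empty row" are treated as end-of-results signals, since the termination rule is unspecified.

```python
from typing import Iterator, List


class FakeHiveServerClient:
    """Hypothetical in-memory stand-in for a generated Thrift client."""

    def __init__(self, rows: List[str]) -> None:
        self._rows = rows
        self._pending: List[str] = []

    def execute(self, query: str) -> None:
        # Every later call implicitly targets this most recent query.
        self._pending = list(self._rows)

    def fetchN(self, numRows: int) -> List[str]:
        batch = self._pending[:numRows]
        self._pending = self._pending[numRows:]
        return batch

    def clean(self) -> None:
        self._pending = []


def fetch_rows(client, query: str, batch_size: int = 2) -> Iterator[str]:
    """execute(), then fetchN() in a loop, and always clean() at the end."""
    client.execute(query)
    try:
        while True:
            batch = client.fetchN(batch_size)
            if not batch:  # assumed signal: nothing left to fetch
                return
            for row in batch:
                if row == "":  # assumed signal: zero-column terminator row
                    return
                yield row
    finally:
        client.clean()
```

Note that the caller has to guess when to stop, which is exactly the awkward part of this API.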
hiveserver2
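The newer one, hiveserver2, is also used through a Thrift API. Looking at it, though, dependencies such as fb303 have been removed and the thrift definitions are much tidier.
When you run the code generator the same way, a "hmm, what is this?" moment shows up. Looking at the code generated for the main TCLIService, I couldn't figure out the arguments of each method... until I realized that the protocol is now to always pass the corresponding Req and get a Resp back. Browsing TCLIService_types tells you what to pass to each method.
Here is the list.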
- OpenSession
  - TOpenSessionReq(username, password, configuration)
  - TOpenSessionResp(status, serverProtocol, sessionHandle, configuration)
- CloseSession
  - TCloseSessionReq(sessionHandle)
- GetInfo
  - TGetInfoReq(sessionHandle, infoType)
  - TGetInfoResp(status, infoValue)
- ExecuteStatement
  - TExecuteStatementReq(sessionHandle, statement, confOverlay)
  - TExecuteStatementResp(status, operationHandle)
- GetTypeInfo
  - TGetTypeInfoReq(sessionHandle)
  - TGetTypeInfoResp(status, operationHandle)
- GetCatalogs
  - TGetCatalogsReq(sessionHandle)
  - TGetCatalogsResp(status, operationHandle)
- GetSchemas
  - TGetSchemasReq(sessionHandle, catalogName, schemaName)
  - TGetSchemasResp(status, operationHandle)
- GetTables
  - TGetTablesReq(sessionHandle, catalogName, schemaName, tableName, tableTypes)
  - TGetTablesResp(status, operationHandle)
- GetTableTypes
  - TGetTableTypesReq(sessionHandle)
  - TGetTableTypesResp(status, operationHandle)
- GetColumns
  - TGetColumnsReq(sessionHandle, catalogName, schemaName, tableName, columnName)
  - TGetColumnsResp(status, operationHandle)
- GetFunctions
  - TGetFunctionsReq(sessionHandle, catalogName, schemaName, functionName)
  - TGetFunctionsResp(status, operationHandle)
- GetOperationStatus
  - TGetOperationStatusReq(operationHandle)
  - TGetOperationStatusResp(status, operationState)
- CancelOperation
  - TCancelOperationReq(operationHandle)
  - TCancelOperationResp(status)
- CloseOperation
  - TCloseOperationReq(operationHandle)
  - TCloseOperationResp(status)
- GetResultSetMetadata
  - TGetResultSetMetadataReq(operationHandle)
  - TGetResultSetMetadataResp(status, schema)
- FetchResults
  - TFetchResultsReq(operationHandle, orientation = 0, maxRows)
  - TFetchResultsResp(status, hasMoreRows, results)
- Bonus: SessionHandle and OperationHandle
  - TSessionHandle(sessionId)
  - TOperationHandle(operationId, operationType, hasResultSet, modifiedRowCount)
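That is a lot more methods. Still, they are well organized and easy to follow. The most distinctive change is that the words sessionHandle and operationHandle now show up all over the place. Presumably this is how multiple sessions are managed. Easy to understand.
Methods such as GetTables have been added. Until now you had to execute something like "show tables" and read back the result, but it looks like this can now be fetched with a single API call. Very convenient.
There is also a status field all over the place, so presumably after ExecuteStatement you check the status as appropriate (GetOperationStatus returns an operationState) and call FetchResults once you have confirmed that the operation has finished. Until now it was a grim battle of blindly calling fetch and waiting for results to come back. The FetchResults result (TFetchResultsResp) has also gained a hasMoreRows field, which is super convenient. No more inserting odd checks to work out whether the result fetch is over.
Overall, I'd say it has become a good API. Once I've checked the sequence around sessions, it looks like I can implement this right away.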
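The sequence described above might look roughly like this. It is a sketch under stated assumptions, not the real generated client: `FakeService` is a hypothetical stub shaped like TCLIService, Req/Resp objects are modeled as plain dicts keyed by the field names from the list, and `FINISHED` / `"RUNNING"` are placeholder state names.

```python
from dataclasses import dataclass, field
from typing import List

FINISHED = "FINISHED"  # placeholder state name, assumed for this sketch


@dataclass
class FakeService:
    """Hypothetical stub shaped like TCLIService; Req/Resp are plain dicts."""
    rows: List[str]
    polls_until_done: int = 2
    log: List[str] = field(default_factory=list)

    def OpenSession(self, req: dict) -> dict:
        self.log.append("OpenSession")
        return {"status": "OK", "sessionHandle": "session-1"}

    def ExecuteStatement(self, req: dict) -> dict:
        self.log.append("ExecuteStatement")
        self._pending = list(self.rows)
        return {"status": "OK", "operationHandle": "op-1"}

    def GetOperationStatus(self, req: dict) -> dict:
        self.log.append("GetOperationStatus")
        self.polls_until_done -= 1
        state = FINISHED if self.polls_until_done <= 0 else "RUNNING"
        return {"status": "OK", "operationState": state}

    def FetchResults(self, req: dict) -> dict:
        self.log.append("FetchResults")
        n = req["maxRows"]
        batch, self._pending = self._pending[:n], self._pending[n:]
        return {"status": "OK", "hasMoreRows": bool(self._pending), "results": batch}

    def CloseOperation(self, req: dict) -> dict:
        self.log.append("CloseOperation")
        return {"status": "OK"}

    def CloseSession(self, req: dict) -> dict:
        self.log.append("CloseSession")
        return {"status": "OK"}


def run_query(service, statement: str, max_rows: int = 2) -> List[str]:
    """Open a session, run one statement, drain its results, then clean up."""
    session = service.OpenSession(
        {"username": "user", "password": "", "configuration": {}})["sessionHandle"]
    try:
        op = service.ExecuteStatement(
            {"sessionHandle": session, "statement": statement,
             "confOverlay": {}})["operationHandle"]
        try:
            # Poll until the operation reports completion.
            while service.GetOperationStatus(
                    {"operationHandle": op})["operationState"] != FINISHED:
                pass  # a real client would sleep between polls
            rows: List[str] = []
            while True:
                resp = service.FetchResults(
                    {"operationHandle": op, "orientation": 0, "maxRows": max_rows})
                rows.extend(resp["results"])
                if not resp["hasMoreRows"]:  # explicit end-of-results flag
                    return rows
        finally:
            service.CloseOperation({"operationHandle": op})
    finally:
        service.CloseSession({"sessionHandle": session})
```

Compared with the old API, every call names the session or operation it acts on, and the end of the result set is signaled explicitly instead of guessed.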
Presto
An angel whispered in my head, "wouldn't it be handy if shib could query Presto?", so while I was at it I looked into that a little too.
Presto basically performs its various operations through an HTTP JSON API. From a quick look there did not seem to be any documentation for it, so I went straight to the source code. Presto's Java source is easy to read, which is nice.
The relevant handlers are around here: presto-server/src/main/java/com/facebook/presto/server/*.java
- ExecuteResource.java:@Path("/v1/execute")
  - @POST (query, user, catalog, schema)
- NodeResource.java:@Path("/v1/node") :
  - @GET
  - @GET "failed"
- PeriodicImportJobResource.java:@Path("/v1/import/jobs")
  - @GET
  - @POST (job, uriInfo)
  - @DELETE "{jobid}"
  - @GET "{jobid}"
- QueryResource.java:@Path("/v1/query") :
  - @GET
  - @POST (query, user, source, catalog, schema, userAgent, resourceContext, uriInfo)
  - @GET "{queryid}"
  - @DELETE "{queryid}"
  - @DELETE "stage/{stageid}"
- ShardResource.java:@Path("/v1/shard") :
  - @DELETE "{shardUuid}"
  - @GET "{shardUuid}"
- StageResource.java:@Path("/v1/stage") :
  - @DELETE "{stageid}"
- StatementResource.java:@Path("/v1/statement") :
  - @POST (statement, user, source, catalog, schema, userAgent, resourceContext, uriInfo)
  - @GET "{queryid}/{token}" (queryId, token, maxWait, uriInfo)
  - @DELETE "{queryid}/{token}"
- TaskResource.java:@Path("/v1/task") :
  - @GET
  - @POST "{taskid}"
  - @GET "{taskid}"
  - @DELETE "{taskid}"
  - @GET "{taskid}/results/{outputid}/{token}"
  - @DELETE "{taskid}/results/{outputid}"
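Surprisingly few. And, well, fairly easy to understand. As for how to use it, it looks like you can just go ahead and hit execute, then look at the information that comes back and call whichever API comes next. A queryid probably comes back, I guess. It is also immediately obvious which endpoint to hit to list queries or to stop one.
Well, as long as I don't do anything weird it looks usable right away. I can always just try it with curl.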
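As a rough idea of what a client for the handlers listed above could look like, here is a sketch that builds a POST to /v1/statement and then follows continuation pages. Much of it is assumption rather than something confirmed by the listing: the `X-Presto-*` header names, the `data` and `nextUri` response fields, and the helper names `build_statement_request` and `collect_rows` are all illustrative. Nothing is sent over the network; the page fetcher is injected.

```python
from typing import Callable, List
from urllib.request import Request


def build_statement_request(base_url: str, statement: str, user: str,
                            catalog: str, schema: str) -> Request:
    """Build (but do not send) a POST to /v1/statement."""
    # Assumption: user/catalog/schema travel as headers with these names.
    headers = {
        "X-Presto-User": user,
        "X-Presto-Catalog": catalog,
        "X-Presto-Schema": schema,
    }
    return Request(base_url + "/v1/statement",
                   data=statement.encode("utf-8"),
                   headers=headers, method="POST")


def collect_rows(first_page: dict,
                 fetch_json: Callable[[str], dict]) -> List[list]:
    """Gather rows from the first response and every follow-up page.

    Assumption: each page may carry "data" (rows) and "nextUri" (the
    "{queryid}/{token}" URL to GET next); a missing "nextUri" means done.
    """
    rows: List[list] = []
    page = first_page
    while True:
        rows.extend(page.get("data", []))
        next_uri = page.get("nextUri")
        if next_uri is None:
            return rows
        page = fetch_json(next_uri)
```

In real use, `fetch_json` would issue an HTTP GET and decode the JSON body; keeping it injectable makes the paging logic easy to try out without a running server.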
Summary
HTTP APIs are wonderful! (Whaaat!?)
The point of all this: when you want to call some RPC API but there is no documentation! *1, don't give up. Skim the code and you can usually figure it out. Recommended. Casual!!!!
Oh, and sneakily, this entry is part of Hadoop Advent Calendar 2013.
http://qiita.com/advent-calendar/2013/hadoop
*1: For the hiveserver family as well, documentation on the RPC side is practically nonexistent, I think.