Accessing HDFS via HttpFS: A C++ Implementation
HttpFS ships with the HDFS project in Hadoop 2.x. Built on Tomcat and Jersey, it exposes the full set of HDFS operations through a RESTful interface, so data can be exchanged without installing an HDFS client, for example to read files stored on HDFS from a Windows machine. Following the HttpFS documentation, this article implements an HttpFS client program in C++ based on libcurl and jsoncpp.
1. Preparation
1.1 Building jsoncpp
jsoncpp download: https://codeload.github.com/open-source-parsers/jsoncpp/zip/master
Open <jsoncpp folder>/makefiles/msvc2010/jsoncpp.sln in VS2010, select the lib_json project, and set its properties: 1) under General, set Configuration Type to Static library (.lib) and Character Set to Multi-Byte; 2) under C/C++ -> Code Generation, set Runtime Library to /MD (Release) or /MDd (Debug). These settings must match the project that will consume the library!
1.2 Building libcurl
libcurl download: https://curl.haxx.se/download/curl-7.47.1.tar.gz
Open <curl folder>\projects\Windows\VC10\curl-all.sln and build the lib_debug and lib_release configurations. If VS2010 then fails to link against the static library:
1) Add the dependent libraries under Project -> Properties -> Linker -> Input -> Additional Dependencies: libcurl.lib ws2_32.lib winmm.lib wldap32.lib (note: the Debug configuration uses libcurld.lib).
2) Add the preprocessor definitions under Project -> Properties -> C/C++ -> Preprocessor: BUILDING_LIBCURL;HTTP_ONLY (be careful not to drop the semicolons).
This fix comes from an online post on resolving link failures when using the static libcurl library with VC2010.
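For reference, the two settings above can also be kept in a reusable property sheet instead of being clicked through the dialogs each time. A minimal sketch, assuming a default VS2010 project (the file name and layout are illustrative):

```xml
<?xml version="1.0" encoding="utf-8"?>
<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <ItemDefinitionGroup>
    <ClCompile>
      <!-- same definitions as step 2, so the static-link symbols resolve -->
      <PreprocessorDefinitions>BUILDING_LIBCURL;HTTP_ONLY;%(PreprocessorDefinitions)</PreprocessorDefinitions>
    </ClCompile>
    <Link>
      <!-- same libraries as step 1; swap libcurl.lib for libcurld.lib in Debug -->
      <AdditionalDependencies>libcurl.lib;ws2_32.lib;winmm.lib;wldap32.lib;%(AdditionalDependencies)</AdditionalDependencies>
    </Link>
  </ItemDefinitionGroup>
</Project>
```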
1.3 Setting up header include paths
Create an include directory under the project folder, copy the contents of the libcurl and jsoncpp include folders into it, and add it to the VC++ include directories.
2. Code Implementation
HttpfsClient.h
#pragma once
#include <string>
#include <vector>
using namespace std;

typedef struct FileStatus {
    __int64 accessTime;
    __int64 blocksize;
    string group;
    __int64 length;
    __int64 modificationTime;
    string owner;
    string pathSuffix;
    string permission;
    int replication;
    string type;
} FileStatus;

class CHttpFSClient
{
private:
    string m_hostaddr;   // http://<HOST>:<PORT>/webhdfs/v1/
    string m_username;   // e.g. hadoop
    long m_timeout;      // overall request timeout (seconds)
    long m_conntimeout;  // connect timeout (seconds)
public:
    enum HTTP_TYPE{GET=0,PUT,POST,DEL};
public:
    CHttpFSClient(string& hostaddr, string& username);
    ~CHttpFSClient(void);
    bool create(string& local_file, string& rem_file, bool overwrite = false);
    bool append(string& local_file, string& rem_file);
    bool mkdirs(string& path);
    bool rename(string& src, string& dst);
    bool del(string& path, bool recursive = false);
    bool read(string& rem_file, string& local_file, long offset = 0, long length = 0);
    bool ls(string& rem_path, vector<FileStatus>& results);
protected:
    static size_t fileread_callback(void *ptr, size_t size, size_t nmemb, void *stream);
    static size_t filewrite_data(const char *ptr, size_t size, size_t nmemb, void *stream);
    static size_t memwrite_data(const char *contents, size_t size, size_t nmemb, string *stream);
    static size_t header_callback(const char *ptr, size_t size, size_t nmemb, std::string *stream);
    void showFileStatus(vector<FileStatus>& results);
};
HttpfsClient.cpp
// HttpfsClient.cpp : entry point of the console application.
#include "stdafx.h"
#include "HttpfsClient.h"
#include <assert.h>
#include <stdio.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <curl/curl.h>
#include <json/json.h>
#include <iostream>
#include <fstream>
using namespace std;

CHttpFSClient::CHttpFSClient(string& hostaddr, string& username)
{
    m_hostaddr = hostaddr;
    m_username = username;
    m_timeout = 60;      // seconds; assumed default
    m_conntimeout = 10;  // seconds; assumed default
    /* On Windows, this inits the winsock stuff */
    curl_global_init(CURL_GLOBAL_ALL);
}

CHttpFSClient::~CHttpFSClient(void)
{
    curl_global_cleanup();
}

/*
Create and Write to a File
@param local_file string
@param rem_file string
@param overwrite: true/false
@return true/false

Step 1: Submit a HTTP PUT request without automatically following redirects and without sending the file data.
curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CREATE
                [&overwrite=<true|false>][&blocksize=<LONG>][&replication=<SHORT>]
                [&permission=<OCTAL>][&buffersize=<INT>]"
The request is redirected to a datanode where the file data is to be written:
HTTP/1.1 307 TEMPORARY_REDIRECT
Location: http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=CREATE...
Content-Length: 0

Step 2: Submit another HTTP PUT request using the URL in the Location header with the file data to be written.
curl -i -X PUT -T <LOCAL_FILE> "http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=CREATE..."
The client receives a 201 Created response with zero content length and the WebHDFS URI of the file in the Location header:
HTTP/1.1 201 Created
Location: webhdfs://<HOST>:<PORT>/<PATH>
Content-Length: 0
*/
bool CHttpFSClient::create(string& local_file, string& rem_file, bool overwrite)
{
    string url = m_hostaddr + rem_file + "?op=CREATE&user.name=" + m_username;
    if (overwrite) url += "&overwrite=true";

    char* redir_url = NULL;
    string strredir_url;
    bool curlerr = false;
    CURL *curl;
    CURLcode res;

    // step 1: PUT with no body and no redirect following, to obtain the datanode URL
    curl = curl_easy_init();
    if (curl) {
        curl_easy_setopt(curl, CURLOPT_PUT, 1L);
        curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
        curl_easy_setopt(curl, CURLOPT_UPLOAD, 1L);
        curl_easy_setopt(curl, CURLOPT_TIMEOUT, m_timeout);
        curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, m_conntimeout);
        curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 0L);
        curl_easy_setopt(curl, CURLOPT_INFILESIZE, 0L); // zero bytes uploaded in step 1
        res = curl_easy_perform(curl);
        if (res != CURLE_OK)
        {
            fprintf(stderr, "hdfs create first request failed: %s\n",
                    curl_easy_strerror(res));
            curlerr = true;
        }
        else
        {
            res = curl_easy_getinfo(curl, CURLINFO_REDIRECT_URL, &redir_url);
            if (res != CURLE_OK || redir_url == NULL)
            {
                fprintf(stderr, "curl_easy_getinfo CURLINFO_REDIRECT_URL failed: %s\n",
                        curl_easy_strerror(res));
                curlerr = true;
            }
            else
                strredir_url = redir_url; // copy before the handle is cleaned up
        }
    }
    // always cleanup
    curl_easy_cleanup(curl);
    if (curlerr)
        return false;

    // step 2: upload the file to the redirected datanode URL
    struct stat file_info;
    // get the size of the local file
    stat(local_file.c_str(), &file_info);
    FILE *hd_src = fopen(local_file.c_str(), "rb");
    if (hd_src == NULL)
        return false;

    struct curl_slist *headers = NULL;
    headers = curl_slist_append(headers, "Content-Type: application/octet-stream");
    curl = curl_easy_init();
    if (curl) {
        // we want to use our own read function
        curl_easy_setopt(curl, CURLOPT_READFUNCTION, CHttpFSClient::fileread_callback);
        // enable uploading
        curl_easy_setopt(curl, CURLOPT_UPLOAD, 1L);
        // HTTP PUT please
        curl_easy_setopt(curl, CURLOPT_PUT, 1L);
        // target URL is the datanode URL taken from the Location header
        curl_easy_setopt(curl, CURLOPT_URL, strredir_url.c_str());
        // specify content type
        curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
        // now specify which file to upload
        curl_easy_setopt(curl, CURLOPT_READDATA, hd_src);
        // provide the size of the upload; we specifically typecast the value to
        // curl_off_t since we must be sure to use the correct data size
        curl_easy_setopt(curl, CURLOPT_INFILESIZE_LARGE,
                         (curl_off_t)file_info.st_size);
        res = curl_easy_perform(curl);
        if (res != CURLE_OK)
        {
            fprintf(stderr, "upload file to hdfs failed: %s\n",
                    curl_easy_strerror(res));
            curlerr = true;
        }
    }
    fclose(hd_src); // close the local file
    // always cleanup
    curl_slist_free_all(headers);
    curl_easy_cleanup(curl);
    return !curlerr;
}

/*
Append to a File
@param local_file string
@param rem_file string
@return true/false

Step 1: Submit a HTTP POST request without automatically following redirects and without sending the file data.
curl -i -X POST "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=APPEND[&buffersize=<INT>]"
The request is redirected to a datanode where the file data is to be appended:
HTTP/1.1 307 TEMPORARY_REDIRECT
Location: http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=APPEND...
Content-Length: 0

Step 2: Submit another HTTP POST request using the URL in the Location header with the file data to be appended.
curl -i -X POST -T <LOCAL_FILE> "http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=APPEND..."
The client receives a response with zero content length:
HTTP/1.1 200 OK
Content-Length: 0
*/
bool CHttpFSClient::append(string& local_file, string& rem_file)
{
    string url = m_hostaddr + rem_file + "?op=APPEND&user.name=" + m_username;

    char* redir_url = NULL;
    string strredir_url;
    long response_code = 0;
    bool curlerr = false;
    CURL *curl;
    CURLcode res;

    // step 1: POST with no body to obtain the datanode redirect URL
    curl = curl_easy_init();
    if (curl) {
        curl_easy_setopt(curl, CURLOPT_POST, 1L);
        curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
        curl_easy_setopt(curl, CURLOPT_TIMEOUT, m_timeout);
        curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 0L);
        curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, m_conntimeout);
        curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE, 0L); // empty body in step 1
        res = curl_easy_perform(curl);
        if (res != CURLE_OK)
        {
            fprintf(stderr, "hdfs append first request failed: %s\n",
                    curl_easy_strerror(res));
            curlerr = true;
        }
        else
        {
            res = curl_easy_getinfo(curl, CURLINFO_REDIRECT_URL, &redir_url);
            if (res != CURLE_OK || redir_url == NULL)
            {
                fprintf(stderr, "curl_easy_getinfo CURLINFO_REDIRECT_URL failed: %s\n",
                        curl_easy_strerror(res));
                curlerr = true;
            }
            else
                strredir_url = redir_url;
        }
    }
    // always cleanup
    curl_easy_cleanup(curl);
    if (curlerr)
        return false;

    // step 2: POST the file contents to the datanode
    struct curl_slist *headers = NULL;
    headers = curl_slist_append(headers, "Content-Type: application/octet-stream");
    curl = curl_easy_init();
    if (curl) {
        curl_easy_setopt(curl, CURLOPT_POST, 1L);
        curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
        curl_easy_setopt(curl, CURLOPT_URL, strredir_url.c_str());
        //curl_easy_setopt(curl, CURLOPT_VERBOSE, 1L);

        /* alternative: multipart/form-data request
        struct curl_httppost *formpost = NULL;
        struct curl_httppost *lastptr = NULL;
        curl_formadd(&formpost, &lastptr, CURLFORM_COPYNAME, "file",
                     CURLFORM_FILE, local_file.c_str(),
                     CURLFORM_CONTENTTYPE, "application/octet-stream", CURLFORM_END);
        curl_easy_setopt(curl, CURLOPT_HTTPPOST, formpost); */

        // read the whole local file into a string; binary mode so the data
        // survives \r\n translation
        ifstream fin(local_file.c_str(), ios::in | ios::binary);
        istreambuf_iterator<char> beg(fin), end;
        string strdata(beg, end);
        fin.close();
        // set the size explicitly; CURLOPT_POSTFIELDS alone would stop at the first NUL byte
        curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE, (long)strdata.size());
        curl_easy_setopt(curl, CURLOPT_POSTFIELDS, strdata.data());
        res = curl_easy_perform(curl);
        //curl_formfree(formpost);
        if (res != CURLE_OK)
        {
            fprintf(stderr, "append file to hdfs failed: %s\n",
                    curl_easy_strerror(res));
            curlerr = true;
        }
        else
        {
            res = curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &response_code);
            if (res != CURLE_OK)
            {
                fprintf(stderr, "curl_easy_getinfo CURLINFO_RESPONSE_CODE failed: %s\n",
                        curl_easy_strerror(res));
                curlerr = true;
            }
        }
    }
    // always cleanup
    curl_slist_free_all(headers);
    curl_easy_cleanup(curl);
    if (curlerr)
        return false;
    return response_code == 200;
}

/*
Make a Directory

Submit a HTTP PUT request.
curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=MKDIRS[&permission=<OCTAL>]"
The client receives a response with a boolean JSON object:
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked

{"boolean": true}
*/
bool CHttpFSClient::mkdirs(string& path)
{
    string url = m_hostaddr + path + "?op=MKDIRS&user.name=" + m_username;

    long response_code = 0;
    long headerlen = 0;
    bool curlerr = false;
    string response_contents;
    CURL *curl;
    CURLcode res;

    curl = curl_easy_init();
    if (curl) {
        // HTTP PUT
        curl_easy_setopt(curl, CURLOPT_PUT, 1L);
        curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
        curl_easy_setopt(curl, CURLOPT_HEADER, 1L); // keep the headers in the buffer
        curl_easy_setopt(curl, CURLOPT_TIMEOUT, m_timeout);
        curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, m_conntimeout);
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, CHttpFSClient::memwrite_data);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response_contents);
        curl_easy_setopt(curl, CURLOPT_INFILESIZE, 0L); // PUT with empty body
        res = curl_easy_perform(curl);
        if (res != CURLE_OK)
        {
            fprintf(stderr, "hdfs mkdirs failed: %s\n",
                    curl_easy_strerror(res));
            curlerr = true;
        }
        else
        {
            res = curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &response_code);
            if (res != CURLE_OK)
            {
                fprintf(stderr, "curl_easy_getinfo CURLINFO_RESPONSE_CODE failed: %s\n",
                        curl_easy_strerror(res));
                curlerr = true;
            }
            res = curl_easy_getinfo(curl, CURLINFO_HEADER_SIZE, &headerlen);
            if (res != CURLE_OK)
            {
                fprintf(stderr, "curl_easy_getinfo CURLINFO_HEADER_SIZE failed: %s\n",
                        curl_easy_strerror(res));
                curlerr = true;
            }
        }
    }
    // always cleanup
    curl_easy_cleanup(curl);
    if (curlerr)
        return false;

    if (response_code == 200)
    {
        // skip the HTTP headers, then parse {"boolean": true/false}
        Json::Reader reader;
        Json::Value root;
        const char *content = response_contents.c_str();
        if (!reader.parse(content + headerlen, content + response_contents.length(), root, false))
            return false;
        return root["boolean"].asBool();
    }
    return false;
}

/*
Rename a File/Directory

Submit a HTTP PUT request.
curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=RENAME&destination=<PATH>"
The client receives a response with a boolean JSON object:
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked

{"boolean": true}
*/
bool CHttpFSClient::rename(string& src, string& dst)
{
    string url = m_hostaddr + src + "?op=RENAME&user.name=" + m_username + "&destination=" + dst;

    long response_code = 0;
    long headerlen = 0;
    bool curlerr = false;
    string response_contents;
    CURL *curl;
    CURLcode res;

    curl = curl_easy_init();
    if (curl) {
        // HTTP PUT
        curl_easy_setopt(curl, CURLOPT_PUT, 1L);
        curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
        curl_easy_setopt(curl, CURLOPT_HEADER, 1L);
        curl_easy_setopt(curl, CURLOPT_TIMEOUT, m_timeout);
        curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, m_conntimeout);
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, CHttpFSClient::memwrite_data);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response_contents);
        curl_easy_setopt(curl, CURLOPT_INFILESIZE, 0L); // PUT with empty body
        res = curl_easy_perform(curl);
        if (res != CURLE_OK)
        {
            fprintf(stderr, "hdfs rename failed: %s\n",
                    curl_easy_strerror(res));
            curlerr = true;
        }
        else
        {
            res = curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &response_code);
            if (res != CURLE_OK)
            {
                fprintf(stderr, "curl_easy_getinfo CURLINFO_RESPONSE_CODE failed: %s\n",
                        curl_easy_strerror(res));
                curlerr = true;
            }
            res = curl_easy_getinfo(curl, CURLINFO_HEADER_SIZE, &headerlen);
            if (res != CURLE_OK)
            {
                fprintf(stderr, "curl_easy_getinfo CURLINFO_HEADER_SIZE failed: %s\n",
                        curl_easy_strerror(res));
                curlerr = true;
            }
        }
    }
    // always cleanup
    curl_easy_cleanup(curl);
    if (curlerr)
        return false;

    if (response_code == 200)
    {
        Json::Reader reader;
        Json::Value root;
        const char *content = response_contents.c_str();
        if (!reader.parse(content + headerlen, content + response_contents.length(), root, false))
            return false;
        return root["boolean"].asBool();
    }
    return false;
}

/*
Delete a File/Directory
@param path string, the file or directory to be deleted
@return true/false

Submit a HTTP DELETE request.
curl -i -X DELETE "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=DELETE
                   [&recursive=<true|false>]"
The client receives a response with a boolean JSON object:
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked

{"boolean": true}
*/
bool CHttpFSClient::del(string& path, bool recursive)
{
    string url = m_hostaddr + path + "?op=DELETE&user.name=" + m_username;
    if (recursive) url += "&recursive=true";

    string response_contents;
    long response_code = 0;
    long headerlen = 0;
    bool curlerr = false;
    CURL *curl;
    CURLcode res;

    curl = curl_easy_init();
    if (curl) {
        // set the DELETE method
        curl_easy_setopt(curl, CURLOPT_CUSTOMREQUEST, "DELETE");
        curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
        curl_easy_setopt(curl, CURLOPT_HEADER, 1L);
        curl_easy_setopt(curl, CURLOPT_TIMEOUT, m_timeout);
        curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, m_conntimeout);
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, CHttpFSClient::memwrite_data);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response_contents);
        res = curl_easy_perform(curl);
        if (res != CURLE_OK)
        {
            fprintf(stderr, "hdfs del failed: %s\n",
                    curl_easy_strerror(res));
            curlerr = true;
        }
        else
        {
            res = curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &response_code);
            if (res != CURLE_OK)
            {
                fprintf(stderr, "curl_easy_getinfo CURLINFO_RESPONSE_CODE failed: %s\n",
                        curl_easy_strerror(res));
                curlerr = true;
            }
            res = curl_easy_getinfo(curl, CURLINFO_HEADER_SIZE, &headerlen);
            if (res != CURLE_OK)
            {
                fprintf(stderr, "curl_easy_getinfo CURLINFO_HEADER_SIZE failed: %s\n",
                        curl_easy_strerror(res));
                curlerr = true;
            }
        }
    }
    // always cleanup
    curl_easy_cleanup(curl);
    if (curlerr)
        return false;

    if (response_code == 200)
    {
        Json::Reader reader;
        Json::Value root;
        const char *content = response_contents.c_str();
        if (!reader.parse(content + headerlen, content + response_contents.length(), root, false))
            return false;
        return root["boolean"].asBool();
    }
    return false;
}

/*
Open and Read a File: read a remote file and write it to local_file
@param rem_file
@param local_file

Submit a HTTP GET request with automatically following redirects.
curl -i -L "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=OPEN
            [&offset=<LONG>][&length=<LONG>][&buffersize=<INT>]"
The request is redirected to a datanode where the file data can be read:
HTTP/1.1 307 TEMPORARY_REDIRECT
Location: http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=OPEN...
Content-Length: 0
The client follows the redirect to the datanode and receives the file data:
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 22

Hello, webhdfs user!
*/
bool CHttpFSClient::read(string& rem_file, string& local_file, long offset, long length)
{
    char url[1024]; // assumed buffer size; the original literal was lost
    if (offset != 0 && length != 0)
        sprintf_s(url, sizeof(url), "%s%s?op=OPEN&user.name=%s&offset=%ld&length=%ld",
                  m_hostaddr.c_str(), rem_file.c_str(), m_username.c_str(), offset, length);
    else
        sprintf_s(url, sizeof(url), "%s%s?op=OPEN&user.name=%s",
                  m_hostaddr.c_str(), rem_file.c_str(), m_username.c_str());

    bool curlerr = false;
    CURL *curl;
    CURLcode res;

    curl = curl_easy_init();
    if (curl) {
        // HTTP GET please
        curl_easy_setopt(curl, CURLOPT_HTTPGET, 1L);
        curl_easy_setopt(curl, CURLOPT_URL, url);
        // OPEN answers with a 307 redirect to a datanode; follow it automatically
        curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
        /* send all data to this function */
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, CHttpFSClient::filewrite_data);

        FILE *pagefile = fopen(local_file.c_str(), "wb");
        if (pagefile == NULL)
        {
            curl_easy_cleanup(curl);
            return false;
        }
        // write the response body to this file handle
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, pagefile);
        res = curl_easy_perform(curl);
        if (res != CURLE_OK)
        {
            fprintf(stderr, "get file from hdfs failed: %s\n",
                    curl_easy_strerror(res));
            curlerr = true;
        }
        fclose(pagefile); // close the local file
    }
    // always cleanup
    curl_easy_cleanup(curl);
    return !curlerr;
}

/*
List a Directory
@param rem_path string, the directory to list
@return the FileStatus entries parsed from the JSON response

Submit a HTTP GET request.
curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=LISTSTATUS"
The client receives a response with a FileStatuses JSON object:
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 427

{
"FileStatuses":
{
"FileStatus":
[
{
"accessTime" : 1320171722771,
"blockSize" : 33554432,
"group" : "supergroup",
"length" : 24930,
"modificationTime": 1320171722771,
"owner" : "webuser",
"pathSuffix" : "a.patch",
"permission" : "644",
"replication" : 1,
"type" : "FILE"
},
{
"accessTime" : 0,
"blockSize" : 0,
"group" : "supergroup",
"length" : 0,
"modificationTime": 1320895981256,
"owner" : "szetszwo",
"pathSuffix" : "bar",
"permission" : "711",
"replication" : 0,
"type" : "DIRECTORY"
},
...
]
}
}
*/
bool CHttpFSClient::ls(string& rem_path, vector<FileStatus>& results)
{
    string url = m_hostaddr + rem_path + "?op=LISTSTATUS&user.name=" + m_username;

    long response_code = 0;
    long headerlen = 0;
    bool curlerr = false;
    string response_contents;
    CURL *curl;
    CURLcode res;

    curl = curl_easy_init();
    if (curl) {
        curl_easy_setopt(curl, CURLOPT_HTTPGET, 1L);
        curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
        curl_easy_setopt(curl, CURLOPT_HEADER, 1L);
        curl_easy_setopt(curl, CURLOPT_TIMEOUT, m_timeout);
        curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, m_conntimeout);
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, CHttpFSClient::memwrite_data);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response_contents);
        res = curl_easy_perform(curl);
        if (res != CURLE_OK)
        {
            fprintf(stderr, "hdfs LISTSTATUS failed: %s\n",
                    curl_easy_strerror(res));
            curlerr = true;
        }
        else
        {
            res = curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &response_code);
            if (res != CURLE_OK)
            {
                fprintf(stderr, "curl_easy_getinfo CURLINFO_RESPONSE_CODE failed: %s\n",
                        curl_easy_strerror(res));
                curlerr = true;
            }
            res = curl_easy_getinfo(curl, CURLINFO_HEADER_SIZE, &headerlen);
            if (res != CURLE_OK)
            {
                fprintf(stderr, "curl_easy_getinfo CURLINFO_HEADER_SIZE failed: %s\n",
                        curl_easy_strerror(res));
                curlerr = true;
            }
        }
    }
    // always cleanup
    curl_easy_cleanup(curl);
    if (curlerr)
        return false;

    if (response_code != 200)
        return false;

    // skip the HTTP headers, then walk FileStatuses.FileStatus[]
    Json::Reader reader;
    Json::Value root;
    const char *content = response_contents.c_str();
    if (!reader.parse(content + headerlen, content + response_contents.length(), root, false))
        return false;
    if (root.empty()) return false;
    Json::Value FileStatuses = root.get("FileStatuses", Json::nullValue);
    if (FileStatuses == Json::nullValue) return false;
    Json::Value FileStatusVec = FileStatuses.get("FileStatus", Json::nullValue);
    if (FileStatusVec == Json::nullValue) return false;

    results.clear();
    int size = FileStatusVec.size();
    for (int i = 0; i < size; ++i)
    {
        FileStatus fst;
        fst.accessTime       = FileStatusVec[i]["accessTime"].asInt64();
        fst.blocksize        = FileStatusVec[i]["blockSize"].asInt64();
        fst.group            = FileStatusVec[i]["group"].asString();
        fst.length           = FileStatusVec[i]["length"].asInt64();
        fst.modificationTime = FileStatusVec[i]["modificationTime"].asInt64();
        fst.owner            = FileStatusVec[i]["owner"].asString();
        fst.pathSuffix       = FileStatusVec[i]["pathSuffix"].asString();
        fst.permission       = FileStatusVec[i]["permission"].asString();
        fst.replication      = FileStatusVec[i]["replication"].asInt();
        fst.type             = FileStatusVec[i]["type"].asString();
        results.push_back(fst);
    }
    showFileStatus(results);
    return true;
}

void CHttpFSClient::showFileStatus(vector<FileStatus>& results)
{
    // print result
    printf("path\towner\tlength\trep\n");
    for (vector<FileStatus>::const_iterator itr = results.begin(); itr != results.end(); ++itr)
    {
        // %I64d is the MSVC format specifier for __int64
        printf("%s\t%s\t%I64d\t%d\n", itr->pathSuffix.c_str(), itr->owner.c_str(),
               itr->length, itr->replication);
    }
}

size_t CHttpFSClient::fileread_callback(void *ptr, size_t size, size_t nmemb, void *stream)
{
    size_t retcode;
    curl_off_t nread;
    /* in real-world cases, this would probably get the data differently,
       as this fread() is exactly what the library would do by default internally */
    retcode = fread(ptr, size, nmemb, (FILE *)stream);
    nread = (curl_off_t)retcode;
    fprintf(stderr, "*** We read %" CURL_FORMAT_CURL_OFF_T " bytes from file\n", nread);
    return retcode;
}

size_t CHttpFSClient::filewrite_data(const char *ptr, size_t size, size_t nmemb, void *stream)
{
    size_t written = fwrite(ptr, size, nmemb, (FILE *)stream);
    return written;
}

size_t CHttpFSClient::memwrite_data(const char *contents, size_t size, size_t nmemb, string *stream)
{
    assert(stream != NULL);
    size_t len = size * nmemb;
    stream->append(contents, len);
    return len;
}

size_t CHttpFSClient::header_callback(const char *ptr, size_t size, size_t nmemb, std::string *stream)
{
    assert(stream != NULL);
    size_t len = size * nmemb;
    stream->append(ptr, len);
    return len;
}

int main(int argc, _TCHAR* argv[])
{
    string hostaddr = "http://192.168.0.111:14000/webhdfs/v1";
    string username = "hadoop";
    CHttpFSClient httpfs(hostaddr, username);
    vector<FileStatus> results;
    string local_file = ".\\test.docx";
    string rem_path = "/test.docx";
    //httpfs.create(local_file, rem_path);
    //httpfs.append(local_file, rem_path);
    httpfs.read(rem_path, local_file);
    //httpfs.ls(rem_path, results);
    //httpfs.del(rem_path);
    getchar();
    return 0;
}
3. Project Code Download
http://files.cnblogs.com/files/hikeepgoing/HttpfsClient.rar