Data Import - Documentation Center - Kingsoft Cloud

MapReduce (KMR)

Data import to KS3

KMR clusters can access the standard storage service (KS3) directly. Before you use KMR, we recommend that you activate the KS3 service and store both your computing programs and raw data in KS3 for easy management and persistent storage.

(1) Go to the KS3 console at http://ks3.ksyun.com/console.html#/ and create a storage space.
(2) Select a "Region" (it must be the same region as your KMR service, because KMR cannot access KS3 across regions) and enter the space name. If public read/write is not required, select "Private" for access control.
(3) Enter the space and select "Content Management" to create a directory or to upload files directly through the browser. For files over 500 MB, upload them with the KS3 SDK or other tools: https://github.com/ks3sdk. The KS3 upload tool is available at: http://www.ksyun.com/doc/art/id/432

Data import to HDFS

Usually, the raw data that KMR needs to process is stored on KS3, and computing jobs can run against it directly. To obtain better processing performance and take full advantage of Hadoop data locality, you can copy the data from KS3 to the HDFS file system of the KMR cluster.
DistCp (Distributed Copy) is a tool for copying data within and between large-scale clusters. It uses MapReduce for file distribution, error handling and recovery, and report generation. Kingsoft Cloud KMR supports using the DistCp tool to copy data directly between HDFS and KS3.
Steps:

(1) Connect to the master node through SSH. Please refer to the SSH connection guide.
(2) Enter the command su hadoop to switch to the hadoop user.
(3) Run the hadoop distcp command, giving the source path followed by the destination path.

Example:

Upload from HDFS to KS3
```
hadoop distcp /user/hadoop/conf/hive-site.xml ks3://testbarcket/kmr/
```

Copy from KS3 to HDFS
```
hadoop distcp ks3://testbarcket/kmr/hive-site.xml /user/hadoop/conf/
```
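
The copy step can also be wrapped in a small helper that lists the destination afterwards, which gives a quick check that the files arrived. This is only a sketch: the function names copy_with_distcp and run and the DRY_RUN switch are hypothetical and not part of KMR, and it assumes that hadoop fs -ls can read the destination in the same way DistCp can write to it.

```shell
#!/bin/sh
# Sketch: copy data with DistCp, then list the destination.
# Intended to be run as the hadoop user on the master node.
# DRY_RUN is a hypothetical switch: when set to 1, each command is
# printed instead of executed, so the script can be reviewed first.

run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# copy_with_distcp SRC DST
# SRC and DST may be HDFS paths or ks3:// URIs.
copy_with_distcp() {
  src="$1"
  dst="$2"
  if [ -z "$src" ] || [ -z "$dst" ]; then
    echo "usage: copy_with_distcp SRC DST" >&2
    return 2
  fi
  # Stop if the copy fails so that the listing is not misleading.
  run hadoop distcp "$src" "$dst" || return 1
  run hadoop fs -ls "$dst"
}
```

For example, after loading these functions into the shell, setting DRY_RUN=1 and calling copy_with_distcp ks3://testbarcket/kmr/hive-site.xml /user/hadoop/conf/ prints the two hadoop commands without touching any data. Unsetting DRY_RUN runs them for real.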

For more use cases of DistCp, please refer to the DistCp Guide.
