Home | 简体中文 | 繁体中文 | 杂文 | Github | 知乎专栏 | 51CTO学院 | CSDN程序员研修院 | OSChina 博客 | 腾讯云社区 | 阿里云栖社区 | Facebook | Linkedin | Youtube | 打赏(Donations) | About
知乎专栏多维度架构

15.3. Keyspace

15.3.1. Schema

15.3.1.1. Keyspace

15.3.1.2. Column family

Name
Column
Super column
Sorting

15.3.2. Keyspace example

例 15.1. Twitter

				
<Keyspace Name="Twitter">
<ColumnFamily CompareWith="UTF8Type" Name="Statuses" />
<ColumnFamily CompareWith="UTF8Type" Name="StatusAudits" />
<ColumnFamily CompareWith="UTF8Type" Name="StatusRelationships"
CompareSubcolumnsWith="TimeUUIDType" ColumnType="Super" />
<ColumnFamily CompareWith="UTF8Type" Name="Users" />
<ColumnFamily CompareWith="UTF8Type" Name="UserRelationships"
CompareSubcolumnsWith="TimeUUIDType" ColumnType="Super" />
</Keyspace>
				
				

例 15.2. Twissandra

				
  <Keyspaces>
    <Keyspace Name="Twissandra">
       <ColumnFamily CompareWith="UTF8Type" Name="User"/>
      <ColumnFamily CompareWith="BytesType" Name="Username"/>
      <ColumnFamily CompareWith="BytesType" Name="Friends"/>
      <ColumnFamily CompareWith="BytesType" Name="Followers"/>
      <ColumnFamily CompareWith="UTF8Type" Name="Tweet"/>
      <ColumnFamily CompareWith="LongType" Name="Timeline"/>
      <ColumnFamily CompareWith="LongType" Name="Userline"/>

      <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>

      <!-- Number of replicas of the data -->
      <ReplicationFactor>1</ReplicationFactor>
      <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>

    </Keyspace>
  </Keyspaces>
  				
				

Schema Layout

In Cassandra, the way that your data is structured is very closely tied to how how it will be retrieved. Let's start with the user ColumnFamily. The key is a user id, and the columns are the properties on the user:

User = {
    'a4a70900-24e1-11df-8924-001ff3591711': {
        'id': 'a4a70900-24e1-11df-8924-001ff3591711',
        'username': 'ericflo',
        'password': '****',
    },
}
				

Since some of the URLs on the site actually have the username, we need to be able to map from the username to the user id:

Username = {
    'ericflo': {
        'id': 'a4a70900-24e1-11df-8924-001ff3591711',
    },
}
				

Friends and followers are keyed by the user id, and then the columns are the friend user id and follower user ids, and we store a timestamp as the value because it's interesting information to have:

Friends = {
    'a4a70900-24e1-11df-8924-001ff3591711': {
        # friend id: timestamp of when the friendship was added
        '10cf667c-24e2-11df-8924-001ff3591711': '1267413962580791',
        '343d5db2-24e2-11df-8924-001ff3591711': '1267413990076949',
        '3f22b5f6-24e2-11df-8924-001ff3591711': '1267414008133277',
    },
}

Followers = {
    'a4a70900-24e1-11df-8924-001ff3591711': {
        # friend id: timestamp of when the followership was added
        '10cf667c-24e2-11df-8924-001ff3591711': '1267413962580791',
        '343d5db2-24e2-11df-8924-001ff3591711': '1267413990076949',
        '3f22b5f6-24e2-11df-8924-001ff3591711': '1267414008133277',
    },
}
				

Tweets are stored in a way similar to users:

Tweet = {
    '7561a442-24e2-11df-8924-001ff3591711': {
        'id': '89da3178-24e2-11df-8924-001ff3591711',
        'user_id': 'a4a70900-24e1-11df-8924-001ff3591711',
        'body': 'Trying out Twissandra. This is awesome!',
        '_ts': '1267414173047880',
    },
}
				

The Timeline and Userline column families keep track of which tweets should appear, and in what order. To that effect, the key is the user id, the column name is a timestamp, and the column value is the tweet id:

Timeline = {
    'a4a70900-24e1-11df-8924-001ff3591711': {
        # timestamp of tweet: tweet id
        1267414247561777: '7561a442-24e2-11df-8924-001ff3591711',
        1267414277402340: 'f0c8d718-24e2-11df-8924-001ff3591711',
        1267414305866969: 'f9e6d804-24e2-11df-8924-001ff3591711',
        1267414319522925: '02ccb5ec-24e3-11df-8924-001ff3591711',
    },
}

Userline = {
    'a4a70900-24e1-11df-8924-001ff3591711': {
        # timestamp of tweet: tweet id
        1267414247561777: '7561a442-24e2-11df-8924-001ff3591711',
        1267414277402340: 'f0c8d718-24e2-11df-8924-001ff3591711',
        1267414305866969: 'f9e6d804-24e2-11df-8924-001ff3591711',
        1267414319522925: '02ccb5ec-24e3-11df-8924-001ff3591711',
    },
}