Memory Safety

Today’s lecture will focus on memory errors that often arise in programs. We will look at how these errors motivate Rust’s ownership model, and explain how Rust prevents them.

Pre-class exercise

This exercise was developed by Will Crichton for CS 242. Thank you, Will, for letting us borrow this material!

Before class on Thursday, please spend just 10 minutes reviewing the following C implementation of a vector. There are at least 7 bugs. You don’t need to catch ‘em all, but try to spot as many as you can.

Please write down the bugs you find. We will be reviewing your findings at the start of class.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

// There are at least 7 bugs relating to memory on this snippet.
// Find them all!

// Vec is short for "vector", a common term for a resizable array.
// For simplicity, our vector type can only hold ints.
typedef struct {
int* data; // Pointer to our array on the heap
int length; // How many elements are in our array
int capacity; // How many elements our array can hold
} Vec;

Vec* vec_new() {
Vec vec;
vec.data = NULL;
vec.length = 0;
vec.capacity = 0;
return &vec;
}

void vec_push(Vec* vec, int n) {
if (vec->length == vec->capacity) {
int new_capacity = vec->capacity * 2;
int* new_data = (int*) malloc(new_capacity);
assert(new_data != NULL);

for (int i = 0; i < vec->length; ++i) {
new_data[i] = vec->data[i];
}

vec->data = new_data;
vec->capacity = new_capacity;
}

vec->data[vec->length] = n;
++vec->length;
}

void vec_free(Vec* vec) {
free(vec);
free(vec->data);
}

void main() {
Vec* vec = vec_new();
vec_push(vec, 107);

int* n = &vec->data[0];
vec_push(vec, 110);
printf("%d\n", *n);

free(vec->data);
vec_free(vec);
}

代码结构

  • Vec是一个结构体,包含一个动态数组指针data、当前数组的长度length和数组容量capacity
  • 提供了几个主要功能:
    • vec_new()创建新的Vec实例;
    • vec_push()Vec添加元素并在必要时调整容量;
    • vec_free()释放Vec的内存;
    • main()函数测试Vec的功能。

错误分析

  1. 局部变量地址错误(vec_new函数)

    1
    2
    Vec vec;
    return &vec;
    • vec_new()函数返回了一个局部变量vec的地址。

    • 在函数返回后,vec的内存会被释放,因此返回的指针指向无效内存,会导致不确定的行为。

    • 修正建议:使用malloc分配Vec的内存。

      1
      2
      3
      4
      5
      Vec* vec = (Vec*) malloc(sizeof(Vec));
      vec->data = NULL;
      vec->length = 0;
      vec->capacity = 0;
      return vec;
  2. 未初始化容量(vec_push函数)

    1
    int new_capacity = vec->capacity * 2;
    • capacity为0时,vec->capacity * 2依旧为0,这导致无法为data分配空间。

    • 修正建议:在容量为0时,可以将new_capacity设为1。

      1
      int new_capacity = (vec->capacity == 0) ? 1 : vec->capacity * 2;
  3. malloc分配大小错误

    1
    int* new_data = (int*) malloc(new_capacity);
    • malloc接受的参数是字节数,而不是元素数量。

    • 因此应该分配new_capacity * sizeof(int)字节的空间。

    • 修正建议

      1
      int* new_data = (int*) malloc(new_capacity * sizeof(int));
  4. 未释放旧data内存(内存泄漏)

    • 在调整容量时,旧的data指针没有被释放,这导致内存泄漏。

    • 修正建议:在重新分配新空间前释放旧的data指针。

      1
      free(vec->data);
  5. 空指针访问(vec_push函数)

    1
    vec->data[vec->length] = n;
    • 在初始情况下,如果vec->dataNULL(即尚未分配空间),访问vec->data[0]会导致空指针解引用,造成运行错误。
    • 修正建议:确保在分配内存后才访问vec->data
  6. 错误释放顺序(vec_free函数)

    1
    2
    free(vec);
    free(vec->data);
    • 应先释放data,再释放Vec结构体本身的指针,否则data的内存可能在释放vec后失效。

    • 修正建议

      1
      2
      free(vec->data);
      free(vec);
  7. 重复释放(main函数)

    1
    2
    free(vec->data);
    vec_free(vec);
    • main中,vec->data被手动释放后,又在vec_free中再次释放,导致“双重释放”错误。
    • 修正建议:只在vec_free中释放data,无需在main中重复释放。
  8. 悬挂指针问题

    1
    2
    3
    int* n = &vec->data[0];
    vec_push(vec, 110);
    printf("%d\n", *n);
    • vec_push可能导致data重新分配新地址,因此n指向的地址可能无效,导致悬挂指针(即指向已释放或无效的内存)。
    • 修正建议:在vec_push后避免使用旧的指针n

修正后

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

typedef struct {
int* data;
int length;
int capacity;
} Vec;

Vec* vec_new() {
Vec* vec = (Vec*) malloc(sizeof(Vec));
vec->data = NULL;
vec->length = 0;
vec->capacity = 0;
return vec;
}

void vec_push(Vec* vec, int n) {
if (vec->length == vec->capacity) {
int new_capacity = (vec->capacity == 0) ? 1 : vec->capacity * 2;
int* new_data = (int*) malloc(new_capacity * sizeof(int));
assert(new_data != NULL);

for (int i = 0; i < vec->length; ++i) {
new_data[i] = vec->data[i];
}

free(vec->data);
vec->data = new_data;
vec->capacity = new_capacity;
}

vec->data[vec->length] = n;
++vec->length;
}

void vec_free(Vec* vec) {
free(vec->data);
free(vec);
}

int main() {
Vec* vec = vec_new();
vec_push(vec, 107);

vec_push(vec, 110);
printf("%d\n", vec->data[0]);

vec_free(vec);
return 0;
}