File I/O in C
File Input/Output (I/O) in C allows programs to interact with files stored on secondary storage, enabling data persistence. This means data can be saved and retrieved even after the program terminates.
Data persistence through files
To retain information beyond the execution of a program, we rely on persistent storage—typically achieved through files on disk.
A file is:
- An abstraction managed by the operating system's file system.
- A named collection of data stored on secondary storage (such as a hard drive or SSD).
- Organized as a sequence of records or bytes.
Files are generally categorized as:
- Text files: Store data as sequences of human-readable characters (e.g., ASCII, UTF-8). Suitable for configuration files, logs, and documents.
- Binary files: Store data in raw byte format, mirroring how data is represented in memory. Used for images, executables, and structured data.
- On Windows, file extensions (like .txt, .bin, .exe) are commonly used to indicate file type, and the OS may treat files differently based on their extension. Separately, the C runtime on Windows distinguishes text mode ("r", or the non-standard "rt") from binary mode ("rb"): in text mode, \r\n line endings are translated to \n on reading and back to \r\n on writing.
- On Unix/Linux, file extensions are not required or enforced by the OS; they are just part of the filename. The kernel treats all files as streams of bytes, regardless of extension. There is no technical distinction between "text" and "binary" files at the OS level; it's up to programs to interpret the contents.
- In fact, in Unix-like systems, everything is a file: not just regular files, but also devices, pipes, and even directories are accessed using file descriptors. This unified approach allows tools like cat, less, or vim to open almost any file, though not all files are human-readable as text.
Using files allows programs to save, retrieve, and share data reliably, making persistent data management possible across program runs.
In C, file I/O operations are typically buffered. This means data is temporarily stored in a buffer (a small region of memory) before being written to the actual file, or read from the file into the buffer. This improves efficiency by reducing the number of direct interactions with the storage device. Standard library functions such as fopen, fread, fwrite, fprintf, and fscanf operate on these buffered streams.
Core file operations
Working with files in C generally involves three main phases:
- Opening a file:
  - Request access to the file from the operating system.
  - The OS prepares internal data structures to manage the file.
  - A FILE pointer is returned to your program to interact with the file.
- Reading/Writing data:
  - Perform read or write operations using the FILE pointer.
  - Data is typically transferred via a buffer.
- Closing a file:
  - Any pending data in the output buffer is written to the file (flushed).
  - Resources allocated by the OS for the file are released.
  - The FILE pointer is no longer valid.
File opening modes
When opening a file, you must specify the mode, which dictates how the file can be accessed. Key modes include:
Mode | Description |
---|---|
"r" | Opens a text file for reading. The file must exist. |
"w" | Opens a text file for writing. If the file exists, its contents are overwritten. If it doesn't exist, a new file is created. |
"a" | Opens a text file for appending. Data is written to the end of the file. If the file doesn't exist, a new file is created. |
"rb" | Opens a binary file for reading. |
"wb" | Opens a binary file for writing (truncates or creates). |
"ab" | Opens a binary file for appending (creates if non-existent). |
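Other modes like "r+", "w+", "a+" (and their binary counterparts "rb+", "wb+", "ab+") allow both reading and writing. "r+" requires the file to exist, "w+" truncates or creates it, and "a+" creates it if needed and always writes at the end.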
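The difference between truncating ("w") and appending ("a") can be sketched with two small helpers. The function names write_text and count_bytes and the file name used below are hypothetical, chosen only for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: opens `path` with `mode`, writes `text`, and closes.
   Returns 0 on success, -1 on any failure. */
int write_text(const char *path, const char *mode, const char *text) {
    FILE *f = fopen(path, mode);
    if (f == NULL) {
        return -1;
    }
    int ok = (fputs(text, f) != EOF);
    if (fclose(f) == EOF) {   /* closing flushes the buffer and can also fail */
        ok = 0;
    }
    return ok ? 0 : -1;
}

/* Hypothetical helper: counts the bytes stored in `path`.
   Returns the count, or -1 if the file cannot be opened. */
long count_bytes(const char *path) {
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        return -1;
    }
    long n = 0;
    while (fgetc(f) != EOF) {
        n++;
    }
    fclose(f);
    return n;
}
```

Calling write_text("modes_demo.txt", "w", "abc") and then write_text("modes_demo.txt", "a", "de") leaves five bytes in the file, while a further call with "w" discards them and starts over.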
The file position indicator (cursor)
When a file is opened, the system maintains a file position indicator (often conceptualized as a cursor). This indicator marks the current position within the file where the next read or write operation will occur.
- For modes "r", "rb", "w", "wb", the cursor is initially at the beginning of the file.
  - Remember: for "w" and "wb", existing content is deleted.
- For modes "a", "ab", the cursor is initially at the end of the file.
- After each read or write operation, the cursor automatically advances by the number of bytes read or written.
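A minimal sketch of how the position indicator moves, using ftell() to observe it. The helper name trace_positions is hypothetical; the file is opened in binary update mode ("wb+") so the reported values are plain byte offsets.

```c
#include <stdio.h>

/* Hypothetical helper: records the position indicator
   pos[0] right after opening, pos[1] after writing 5 bytes,
   pos[2] after rewinding and reading 1 byte.
   Returns 0 on success, -1 on failure. */
int trace_positions(const char *path, long pos[3]) {
    FILE *f = fopen(path, "wb+");
    if (f == NULL) {
        return -1;
    }
    pos[0] = ftell(f);                 /* freshly truncated file: start */
    if (fputs("hello", f) == EOF) {
        fclose(f);
        return -1;
    }
    pos[1] = ftell(f);                 /* advanced by the 5 bytes written */
    rewind(f);                         /* back to the start before reading */
    if (fgetc(f) == EOF) {
        fclose(f);
        return -1;
    }
    pos[2] = ftell(f);                 /* advanced by the 1 byte read */
    fclose(f);
    return 0;
}
```

Note the rewind() between the write and the read: in update modes, a positioning call (or fflush()) is required when switching from writing to reading.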
Working with files
The FILE structure
All file operations in C are performed using a pointer to a FILE structure. This structure is defined in <stdio.h> and holds information about the file stream, such as the buffer, current position, and error indicators.
You don't need to know the internal details of the FILE structure. You'll simply declare a pointer of type FILE* and use it with standard library functions.
#include <stdio.h>
FILE *fptr; // Declare a file pointer
Opening a file: fopen()
The fopen() function is used to open a file.
FILE* fopen(const char *filename, const char *mode);
- filename: A string containing the name (and path) of the file.
- mode: A string specifying the access mode (e.g., "r", "w", "rb").
- Return value:
  - On success, fopen() returns a FILE pointer.
  - On failure (e.g., file not found, no permission), it returns NULL.
Always check the return value of fopen():
#include <stdio.h>
#include <stdlib.h> // For exit()
int main() {
FILE *fptr;
fptr = fopen("myfile.txt", "w");
if (fptr == NULL) {
perror("Error opening file"); // Prints a system error message
// Or: fprintf(stderr, "Error opening file myfile.txt\n");
return 1; // Or exit(1);
}
// ... proceed with file operations ...
// fclose(fptr); // Don't forget to close (shown later)
return 0;
}
Closing a file: fclose()
The fclose() function is used to close an opened file.
int fclose(FILE *stream);
- stream: The FILE pointer to the file to be closed.
- Return value:
  - Returns 0 on success.
  - Returns EOF (a negative constant, usually -1) on error.
fclose() flushes any unwritten data from the buffer to the file and releases system resources.
#include <stdio.h>
#include <stdlib.h>
int main() {
FILE *fptr;
fptr = fopen("myfile.txt", "w");
if (fptr == NULL) {
fprintf(stderr, "Cannot open file\n");
return 1;
}
// Write something to demonstrate buffer flushing (optional here)
fprintf(fptr, "Hello, World!\n");
if (fclose(fptr) == EOF) {
fprintf(stderr, "Error closing file\n");
return 1;
}
printf("File opened, written, and closed successfully.\n");
return 0;
}
Standard I/O streams
When a C program starts, three standard I/O streams are automatically opened:
- stdin: Standard input (usually the keyboard).
- stdout: Standard output (usually the terminal/screen).
- stderr: Standard error (usually the terminal/screen, for error messages).
These are FILE pointers. Functions like printf() and scanf() are convenient wrappers:
- printf(...) is equivalent to fprintf(stdout, ...)
- scanf(...) is equivalent to fscanf(stdin, ...)
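Because the standard streams are ordinary FILE pointers, a function that takes a FILE* can write to the terminal or to a disk file without any change. The sketch below assumes a hypothetical helper named print_report.

```c
#include <stdio.h>

/* Hypothetical helper: writes a one-line report to whichever stream is given.
   Returns the number of characters written, or a negative value on error,
   exactly as fprintf does. */
int print_report(FILE *out, const char *name, int score) {
    return fprintf(out, "%s scored %d\n", name, score);
}
```

The same call works with stdout, stderr, or a pointer returned by fopen(), for example print_report(stderr, "Alice", 30).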
1️⃣ Text file I/O
Writing to text files
- fprintf(): Writes formatted output to a file.
  int fprintf(FILE *stream, const char *format, ...);
  // Returns the number of characters written, or a negative value on error.
  Example:
  fprintf(fptr, "Name: %s, Age: %d\n", "Alice", 30);
- fputc(): Writes a single character.
  int fputc(int character, FILE *stream);
  // Returns the character written, or EOF on error.
- fputs(): Writes a string (does not append a newline character automatically).
  int fputs(const char *str, FILE *stream);
  // Returns a non-negative value on success, or EOF on error.
Reading from text files
- fscanf(): Reads formatted input from a file.
  int fscanf(FILE *stream, const char *format, ...);
  // Returns the number of input items successfully matched and assigned,
  // or EOF if an input failure occurs before any conversion.
  Example:
  char name[50];
  int age;
  // %49s keeps the read inside the 50-byte buffer; check that both items matched
  if (fscanf(fptr, "%49s %d", name, &age) == 2) {
  // Use name and age
  }
- fgetc(): Reads a single character.
  int fgetc(FILE *stream);
  // Returns the character read (as an int), or EOF on end-of-file or error.
- fgets(): Reads a string (safer than fscanf for strings because the buffer size is passed explicitly, which prevents buffer overflows).
  char* fgets(char *str, int num, FILE *stream);
  // Reads up to num-1 characters, or until a newline, or EOF.
  // Stores the newline if read. Appends a null terminator.
  // Returns str on success, NULL on error or EOF if no characters were read.
  Example:
  char line[256];
  if (fgets(line, sizeof(line), fptr) != NULL) {
  // Process the line
  }
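The text functions above can be combined into a small round trip: write records with fprintf(), then read them back line by line with fgets() and parse each line with sscanf(). This is a sketch under assumptions: the helper names save_ages and sum_ages and the "name age" line format are invented for the example.

```c
#include <stdio.h>

/* Hypothetical helper: writes one "name age" line per entry.
   Returns 0 on success, -1 on failure. */
int save_ages(const char *path, const char *names[], const int ages[], int n) {
    FILE *f = fopen(path, "w");
    if (f == NULL) {
        return -1;
    }
    for (int i = 0; i < n; i++) {
        if (fprintf(f, "%s %d\n", names[i], ages[i]) < 0) {
            fclose(f);
            return -1;
        }
    }
    return fclose(f) == EOF ? -1 : 0;
}

/* Hypothetical helper: reads the file line by line and sums the ages.
   Lines that do not match "name age" are skipped.
   Returns the sum, or -1 on failure. */
long sum_ages(const char *path) {
    FILE *f = fopen(path, "r");
    if (f == NULL) {
        return -1;
    }
    char line[128];
    char name[50];
    int age;
    long total = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        /* %49s bounds the name to the 50-byte buffer */
        if (sscanf(line, "%49s %d", name, &age) == 2) {
            total += age;
        }
    }
    int failed = ferror(f);   /* distinguish a read error from end-of-file */
    fclose(f);
    return failed ? -1 : total;
}
```

Reading whole lines with fgets() and parsing them afterwards keeps a malformed line from derailing the rest of the file, which can happen when fscanf() is applied directly to the stream.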
2️⃣ Binary file I/O
Binary files store data as raw bytes, exactly as they are represented in memory. This is efficient for non-textual data like images, audio, or complex data structures.
Writing to binary files: fwrite()
size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream);
- ptr: Pointer to the array of elements to be written.
- size: Size in bytes of each element to be written.
- nmemb: Number of elements, each one with a size of size bytes.
- stream: The FILE pointer.
- Return value: The number of elements successfully written (which may be less than nmemb if an error occurs).
Reading from binary files: fread()
size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream);
- ptr: Pointer to a block of memory with a minimum size of size*nmemb bytes.
- size: Size in bytes of each element to be read.
- nmemb: Number of elements, each one with a size of size bytes.
- stream: The FILE pointer.
- Return value: The number of elements successfully read (which may be less than nmemb if the end of the file is reached or an error occurs).
sizeof
Always use the sizeof operator to determine the size argument for fread and fwrite. This ensures portability, as data type sizes can vary across different systems.
Example: fwrite and fread with an array of integers
#include <stdio.h>
#include <stdlib.h>
#define NUM_INTS 5
int main() {
FILE *fptr;
int numbers[NUM_INTS] = {10, 20, 30, 40, 50};
int read_numbers[NUM_INTS];
size_t items_written, items_read;
// Write to binary file
fptr = fopen("numbers.dat", "wb");
if (fptr == NULL) {
perror("Error opening file for writing");
return 1;
}
items_written = fwrite(numbers, sizeof(int), NUM_INTS, fptr);
if (items_written < NUM_INTS) {
fprintf(stderr, "Error writing to file or partial write.\n");
}
fclose(fptr);
printf("%zu integers written to numbers.dat\n", items_written);
// Read from binary file
fptr = fopen("numbers.dat", "rb");
if (fptr == NULL) {
perror("Error opening file for reading");
return 1;
}
items_read = fread(read_numbers, sizeof(int), NUM_INTS, fptr);
if (items_read < NUM_INTS) {
if (feof(fptr)) {
printf("End of file reached before reading all items.\n");
} else if (ferror(fptr)) {
perror("Error reading from file");
}
}
fclose(fptr);
printf("%zu integers read from numbers.dat: ", items_read);
for (size_t i = 0; i < items_read; i++) {
printf("%d ", read_numbers[i]);
}
printf("\n");
return 0;
}
5 integers written to numbers.dat
5 integers read from numbers.dat: 10 20 30 40 50
Text vs binary file operations: A comparison
Consider int x = 31466; (which is 0x7AEA in hexadecimal).
fprintf(fptr, "%d", x);
- Converts the integer 31466 to its character representation "31466".
- Writes these 5 characters (ASCII bytes) to the file.
- If you inspect the file with a hex dump tool (like xxd), you'd see the ASCII codes for '3', '1', '4', '6', '6'.
  $ xxd output_fprintf.txt
  00000000: 3331 3436 36                             31466
fwrite(&x, sizeof(int), 1, fptr);
- Takes the raw binary representation of x from memory (e.g., 4 bytes for a typical int).
- Writes these bytes directly to the file.
- If int is 4 bytes and the system is little-endian, 31466 (0x00007AEA) would be stored as EA 7A 00 00.
  $ xxd output_fwrite.bin
  00000000: ea7a 0000                                .z..
This distinction is crucial:
- Text files are human-readable but less space-efficient
- Binary files are compact and faster for data exchange between programs but not directly readable
- Binary files preserve exact memory representation, which is vital for structured data
- Text files are better for interoperability between systems with different architectures
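The size difference between the two representations can be measured directly. The sketch below writes the same integer once with fprintf() and once with fwrite(), then reports the resulting file sizes. The helper names file_size, size_as_text, and size_as_binary are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: returns the size of `path` in bytes, or -1 on failure. */
static long file_size(const char *path) {
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        return -1;
    }
    long size = -1;
    if (fseek(f, 0L, SEEK_END) == 0) {
        size = ftell(f);
    }
    fclose(f);
    return size;
}

/* Writes x as decimal text and returns the resulting file size, or -1. */
long size_as_text(const char *path, int x) {
    FILE *f = fopen(path, "w");
    if (f == NULL) {
        return -1;
    }
    int ok = (fprintf(f, "%d", x) >= 0);
    if (fclose(f) == EOF) {
        ok = 0;
    }
    return ok ? file_size(path) : -1;
}

/* Writes the raw bytes of x and returns the resulting file size, or -1. */
long size_as_binary(const char *path, int x) {
    FILE *f = fopen(path, "wb");
    if (f == NULL) {
        return -1;
    }
    int ok = (fwrite(&x, sizeof x, 1, f) == 1);
    if (fclose(f) == EOF) {
        ok = 0;
    }
    return ok ? file_size(path) : -1;
}
```

The text size depends on the number of digits (one byte for 7, five for 31466), whereas the binary size is always sizeof(int), whatever the value.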
File positioning
You can control the file position indicator using these functions:
- fseek(): Sets the file position indicator to a specific location.
  int fseek(FILE *stream, long offset, int origin);
  - stream: The FILE pointer.
  - offset: Number of bytes to move from origin.
  - origin: Reference point for the offset. Can be:
    - SEEK_SET: Beginning of the file.
    - SEEK_CUR: Current file position.
    - SEEK_END: End of the file.
  - Returns 0 on success, non-zero on error.
- ftell(): Returns the current value of the file position indicator.
  long ftell(FILE *stream);
  // Returns current position in bytes from the beginning, or -1L on error.
- rewind(): Sets the file position indicator to the beginning of the file.
  void rewind(FILE *stream);
  // Equivalent to fseek(stream, 0L, SEEK_SET), but also clears the error indicator.
Example: Get file size using fseek and ftell
#include <stdio.h>
int main() {
FILE *fptr = fopen("somefile.txt", "rb"); // Open in binary mode for accurate size
if (fptr == NULL) {
perror("Error opening file");
return 1;
}
if (fseek(fptr, 0, SEEK_END) != 0) { // Go to the end of the file
perror("fseek error");
fclose(fptr);
return 1;
}
long file_size = ftell(fptr); // Get current position (which is the size)
if (file_size == -1L) {
perror("ftell error");
fclose(fptr);
return 1;
}
printf("File size: %ld bytes\n", file_size);
rewind(fptr); // Go back to the beginning
// ... can now read from the start ...
fclose(fptr);
return 0;
}
For text files, the value returned by ftell might not always correspond to the exact byte count from the beginning on some systems due to character encoding or line ending conversions. For accurate byte offsets and file sizes, it's often better to open files in binary mode ("rb", "wb", etc.) when using fseek and ftell.
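Beyond measuring file size, fseek() makes random access possible: when every record has the same size, the record at index i starts at byte i * sizeof(record). A minimal sketch, assuming a file of raw int values and hypothetical helpers write_ints and read_int_at:

```c
#include <stdio.h>

/* Hypothetical helper: writes n ints to a binary file.
   Returns 0 on success, -1 on failure. */
int write_ints(const char *path, const int *values, size_t n) {
    FILE *f = fopen(path, "wb");
    if (f == NULL) {
        return -1;
    }
    int ok = (fwrite(values, sizeof *values, n, f) == n);
    if (fclose(f) == EOF) {
        ok = 0;
    }
    return ok ? 0 : -1;
}

/* Hypothetical helper: reads the int stored at position `index`
   without reading the values before it.
   Returns 0 on success, -1 on failure (including an index past the end). */
int read_int_at(const char *path, long index, int *out) {
    if (index < 0) {
        return -1;
    }
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        return -1;
    }
    int ok = 0;
    /* Jump straight to the record, then read exactly one element */
    if (fseek(f, index * (long)sizeof(int), SEEK_SET) == 0 &&
        fread(out, sizeof *out, 1, f) == 1) {
        ok = 1;
    }
    fclose(f);
    return ok ? 0 : -1;
}
```

Seeking past the end of a file is not itself an error; the failure shows up when fread() returns fewer elements than requested, which is why the result of fread() is checked as well.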
Buffer management: fflush()
As mentioned, output to files is usually buffered. The fflush() function forces any unwritten data in the stream's buffer to be written to the host environment's file.
int fflush(FILE *stream);
- If stream points to an output stream, fflush causes unwritten data to be delivered to the host environment.
- If stream is NULL, fflush flushes all output streams.
- Returns 0 on success, EOF on a write error.
fclose() automatically flushes the stream before closing the file. However, fflush() can be useful if you need to ensure data is written at a specific point without closing the file, for example, in applications that log data continuously. Frequent use of fflush() can degrade performance.
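For the continuous-logging case, one common pattern is to flush after each complete line so that the log stays current while the file remains open. The helper name log_line below is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: appends one line to an already open log stream
   and flushes it immediately.
   Returns 0 on success, -1 if writing or flushing fails. */
int log_line(FILE *log, const char *msg) {
    if (fprintf(log, "%s\n", msg) < 0) {
        return -1;
    }
    if (fflush(log) == EOF) {   /* hand the buffered data to the OS now */
        return -1;
    }
    return 0;
}
```

Flushing per line trades throughput for timeliness; a program that writes many small records might flush only at checkpoints instead.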
Error handling
After file operations, it's good practice to check for errors using:
- ferror(): Checks if the error indicator for the given stream is set.
  int ferror(FILE *stream);
  // Returns non-zero if error indicator is set, 0 otherwise.
- feof(): Checks if the end-of-file indicator for the given stream is set.
  int feof(FILE *stream);
  // Returns non-zero if EOF indicator is set, 0 otherwise.
- perror(): Prints a system error message corresponding to the current value of errno.
  void perror(const char *s);
  // Prints s, a colon, a space, the error message, and a newline.
- clearerr(): Clears the end-of-file and error indicators for the stream.
  void clearerr(FILE *stream);
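A read function returning EOF does not say why it stopped; ferror() and feof() are what tell a failure apart from a normal end of file. The sketch below copies one stream to another and reports which side failed. The function name copy_stream and the return codes are assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical return codes for copy_stream */
enum { COPY_OK = 0, COPY_READ_ERROR = -1, COPY_WRITE_ERROR = -2 };

/* Copies everything from `in` to `out`, one character at a time. */
int copy_stream(FILE *in, FILE *out) {
    int c;   /* int rather than char, so EOF stays distinct from data bytes */
    while ((c = fgetc(in)) != EOF) {
        if (fputc(c, out) == EOF) {
            return COPY_WRITE_ERROR;
        }
    }
    /* fgetc returned EOF: check whether it was an error or the end of input */
    if (ferror(in)) {
        return COPY_READ_ERROR;
    }
    return COPY_OK;   /* feof(in) is set at this point */
}
```

Checking the indicators after the loop, rather than testing feof() in the loop condition, avoids processing one spurious extra item at the end of the file.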
📝 Exercises
Exercise 1: Storing person data
Create a program insert_person.c that prompts the user to enter data for one person (surname, name, gender, birth year) and writes this data as a single record to a binary file named people.dat.
Data structure:
- Surname (max 30 characters)
- Name (max 30 characters)
- Gender (a single character: 'M', 'F', 'N')
- Birth Year (integer)
Show Solution for Exercise 1
// insert_person.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h> // For strlen
typedef struct {
char surname[31];
char name[31];
char gender;
int birth_year;
} Person;
// Helper function to read a line safely
void read_line(char *buffer, int size) {
if (fgets(buffer, size, stdin) != NULL) {
// Remove newline character if present
size_t len = strlen(buffer);
if (len > 0 && buffer[len - 1] == '\n') {
buffer[len - 1] = '\0';
}
} else {
buffer[0] = '\0'; // Ensure empty string on error
}
}
int main() {
Person p;
FILE *fptr;
printf("Enter surname (max 30 chars): ");
read_line(p.surname, sizeof(p.surname));
printf("Enter name (max 30 chars): ");
read_line(p.name, sizeof(p.name));
printf("Enter gender (M/F/N): ");
char gender_input[5]; // Buffer for gender input
read_line(gender_input, sizeof(gender_input));
if (strlen(gender_input) > 0) {
p.gender = gender_input[0];
} else {
p.gender = 'N'; // Default or error
}
printf("Enter birth year: ");
char year_input[10];
read_line(year_input, sizeof(year_input));
p.birth_year = atoi(year_input); // Convert string to int
fptr = fopen("people.dat", "ab"); // Open in append binary mode
if (fptr == NULL) {
perror("Error opening people.dat");
return 1;
}
size_t items_written = fwrite(&p, sizeof(Person), 1, fptr);
if (items_written < 1) {
fprintf(stderr, "Error writing person data to file.\n");
fclose(fptr);
return 1;
}
printf("Person data successfully written to people.dat\n");
fclose(fptr);
return 0;
}
To compile and run:
gcc insert_person.c -o insert_person
./insert_person
Exercise 2: Reading person data
Create a program read_people.c that reads all person records from the binary file people.dat (created in Exercise 1) and prints them to the console.
Show Solution for Exercise 2
// read_people.c
#include <stdio.h>
#include <stdlib.h>
typedef struct {
char surname[31];
char name[31];
char gender;
int birth_year;
} Person;
int main() {
Person p;
FILE *fptr;
int count = 0;
fptr = fopen("people.dat", "rb"); // Open in read binary mode
if (fptr == NULL) {
perror("Error opening people.dat. (Has it been created by insert_person?)");
return 1;
}
printf("--- People Data ---\n");
// Read records one by one until fread returns 0 (EOF or error)
while (fread(&p, sizeof(Person), 1, fptr) == 1) {
count++;
printf("Record %d:\n", count);
printf(" Surname: %s\n", p.surname);
printf(" Name: %s\n", p.name);
printf(" Gender: %c\n", p.gender);
printf(" Birth Year: %d\n", p.birth_year);
printf("--------------------\n");
}
if (ferror(fptr)) {
perror("Error reading from file");
} else if (count == 0 && feof(fptr)) {
printf("No records found in people.dat or file is empty.\n");
} else if (count > 0) {
printf("%d record(s) read successfully.\n", count);
}
fclose(fptr);
return 0;
}
To compile and run (after running insert_person at least once):
gcc read_people.c -o read_people
./read_people
Exercise 3: Merging exam records
Create a program that processes student exam records from two different sources:
Requirements:
- Read two binary files (transcript1.bin and transcript2.bin) containing exam records
- Each record consists of:
  - A course name (string up to 20 characters plus null terminator)
  - A grade (integer)
- For each exam, select the higher grade between the two files
- Write a new binary file (results.bin) containing all exams with their highest grades
- After writing, read and display the output file to verify correctness
- Use proper error handling for all file operations
Notes:
- Both input files have the same structure and contain the same exams in the same order
- Organize your code using functions to handle file operations and data processing
- Define appropriate structures to represent the exam records
Show Solution for Exercise 3
#include <stdio.h>
#include <string.h>

#define MAX_COURSE_LEN 21 // 20 characters + null terminator

typedef struct {
    char course[MAX_COURSE_LEN];
    int grade;
} ExamRecord;

int main(void) {
    ExamRecord rec1, rec2;
    FILE *f1 = fopen("transcript1.bin", "rb");
    FILE *f2 = fopen("transcript2.bin", "rb");
    FILE *fout = fopen("results.bin", "w+b"); // w+ for write and read

    if (f1 == NULL || f2 == NULL || fout == NULL) {
        fprintf(stderr, "Error opening one of the files. Exiting...\n");
        // Close whichever files did open before exiting
        if (f1 != NULL) fclose(f1);
        if (f2 != NULL) fclose(f2);
        if (fout != NULL) fclose(fout);
        return 1;
    }

    // Read one record from each file; stop as soon as either file runs out
    while (fread(&rec1, sizeof(ExamRecord), 1, f1) == 1 &&
           fread(&rec2, sizeof(ExamRecord), 1, f2) == 1) {
        // Guarantee termination in case the file data is malformed
        rec1.course[MAX_COURSE_LEN - 1] = '\0';
        rec2.course[MAX_COURSE_LEN - 1] = '\0';

        if (strcmp(rec1.course, rec2.course) != 0) { // Verify courses match
            fprintf(stderr, "Course mismatch: %s vs %s (skipped)\n",
                    rec1.course, rec2.course);
            continue;
        }
        if (rec1.grade < rec2.grade) {
            rec1.grade = rec2.grade; // Take the higher grade
        }
        if (fwrite(&rec1, sizeof(ExamRecord), 1, fout) != 1) {
            perror("Error writing results.bin");
            break;
        }
    }
    if (ferror(f1) || ferror(f2)) {
        fprintf(stderr, "Error reading one of the input files\n");
    }
    fclose(f1);
    fclose(f2);

    // Rewind the output file to read from the beginning
    rewind(fout);

    // Read and print the results file to verify
    printf("Contents of results.bin:\n");
    printf("------------------------\n");
    printf("%-20s | %s\n", "Course", "Grade");
    printf("------------------------\n");
    while (fread(&rec1, sizeof(ExamRecord), 1, fout) == 1) {
        printf("%-20s | %d\n", rec1.course, rec1.grade);
    }
    fclose(fout);
    return 0;
}
Exercise 4: Alien communication decoder
NASA operators have intercepted strange signals from an alien civilization! The signals consist of sequences of 'g' and 'G' characters. A secret file has been discovered that contains the key to decode these messages.
Part 1: Decoding alien messages
Create a program that:
- Reads a translation key from correspondence.txt where:
  - Each line contains a Latin letter, a space, and a 3-character sequence of 'g' and 'G'
  - Example: H GgG
- Reads alien messages from messages.txt where:
  - Each message consists of sequences of 'g' and 'G' characters grouped in threes
- Decodes each message and displays the translated text
- Uses appropriate data structures to organize the translation data
- Handles file operations safely with error checking
Part 2: Encoding messages for aliens
Extend your program to:
- Allow users to input messages in Latin characters
- Encode these messages into the alien 'g'/'G' format
- Display the encoded message
- Design your solution with appropriate functions for modularity
Show Solution for Exercise 4
#include <stdio.h>
#include <string.h>

#define MAX_PAIRS 26 // At most one code per Latin letter

typedef struct {
    char letter[2]; // Latin letter + null terminator
    char code[4];   // 3 character alien code + null terminator
} TranslationPair;

int main(void) {
    int table_size = 0;
    int j;
    size_t i, len;
    char current_triplet[4]; // Buffer for one triplet (3 chars + null terminator)
    char message[50];        // Buffer for reading messages
    FILE *fp;
    TranslationPair translations[MAX_PAIRS];

    // Open correspondence.txt file
    fp = fopen("correspondence.txt", "r");
    if (fp == NULL) {
        perror("Cannot open correspondence.txt");
        return 1;
    }

    // Read the translation table; the field widths keep fscanf inside the buffers
    while (table_size < MAX_PAIRS &&
           fscanf(fp, "%1s %3s", translations[table_size].letter,
                  translations[table_size].code) == 2) {
        printf("%c %s\n", translations[table_size].letter[0],
               translations[table_size].code);
        table_size++;
    }
    fclose(fp);

    // Open messages.txt file
    fp = fopen("messages.txt", "r");
    if (fp == NULL) {
        perror("Cannot open messages.txt");
        return 1;
    }

    // Process each message from messages.txt
    printf("\nDecoded messages:\n");
    printf("----------------\n");
    // Width 48 is a multiple of 3, so an overlong message is split on a triplet boundary
    while (fscanf(fp, "%48s", message) == 1) {
        len = strlen(message);
        printf("Encoded: %s\n", message);
        printf("Decoded: ");
        // Process each complete triplet; a trailing partial group is ignored
        for (i = 0; i + 3 <= len; i += 3) {
            memcpy(current_triplet, &message[i], 3);
            current_triplet[3] = '\0';
            // Look up the triplet in our translation table
            for (j = 0; j < table_size; j++) {
                if (strcmp(current_triplet, translations[j].code) == 0) {
                    printf("%c", translations[j].letter[0]);
                    break;
                }
            }
            if (j == table_size) {
                printf("?"); // Unknown triplet
            }
        }
        printf("\n\n");
    }
    fclose(fp);

    // Part 2: Encode a message from the user
    printf("Enter a message to encode: ");
    if (scanf("%49s", message) != 1) {
        fprintf(stderr, "No message entered\n");
        return 1;
    }

    // Encode the message letter by letter
    printf("Encoded: ");
    for (i = 0; message[i] != '\0'; i++) {
        for (j = 0; j < table_size; j++) {
            if (translations[j].letter[0] == message[i]) {
                printf("%s", translations[j].code);
                break;
            }
        }
        if (j == table_size) {
            printf("?"); // Letter not in the translation table
        }
    }
    printf("\n");
    return 0;
}