When I start learning new programming languages, I often consider how to create models for specific domains. I don’t know whether my consulting background influences this, but it always comes to mind. I’m sharing my initial attempts at using Rust, following a functional domain modeling approach, so there may be room for improvement in some of the implementations in this blog post.
I plan to cover functional domain modeling in two separate articles. This first part focuses on the more basic concepts in Rust, whereas the second part will consider the Rust memory management system and how it impacts the design of our domain models.
Domain modeling, influenced by functional programming principles, aims to represent the business domain in the code accurately. Rust is ideal thanks to its language features and type system, which enforce correctness and reduce the likelihood of bugs. By modeling the domain accurately, we aim to use the Rust compiler to catch errors early and prevent them from propagating to runtime. This will help us to reduce the need for extensive unit testing and improve the reliability and maintainability of the codebase.
Let’s emphasize some of the principles, types, and techniques brought by Functional Programming:
- Algebraic Data Types (ADTs): Rust’s `enum` and `struct` types can be used to define ADTs.
- Pure functions: functions that have no side effects and always return the same output given the same input. Pure functions can help ensure that the domain model is correct and predictable.
- The `Result` and `Option` types represent errors and optional values, respectively. They let us validate models so that each one is consistent, complete, and satisfies any invariants or constraints required by the business domain.
- With traits, we can create a more expressive and flexible domain model by defining traits corresponding to domain concepts.
- Finally, and related to the previous point, given the importance of avoiding mutable state and side effects, Rust supports this principle through its ownership and borrowing system, which ensures that memory is managed safely and efficiently. Smart pointers, such as `Box`, `Rc`, and `Arc`, are also relevant for this purpose because they allow writing more functional code.
We will review the final topic in the next blog post. Today, we will examine the remaining bullet points using a Hiring Pipeline with its candidates as an example. Thus, we will implement the domain logic, define relationships between entities, validate models, and capture behaviors.
Algebraic Data Types (ADTs)
In Rust, we can use ADTs to model our application’s domain entities and relationships in a functional way, clearly defining the set of possible values and states. Rust has two main types of ADTs: `enum` and `struct`. `enum` is used to define a type that can take on one of several possible variants, while `struct` is used here to express a type that has named fields.
Let’s start working on the example we mentioned above:
struct Candidate {
    id: u64,
    name: String,
    email: String,
    experience_level: String,
    interview_status: String,
    application_status: String,
}
The `Candidate` struct represents a candidate in a hiring pipeline with a unique ID, a name, an email address, an experience level, and application and interview statuses. This model is quite naive: it uses only basic types (an unsigned integer and strings), so we cannot restrict the values that each field can take.
let candidate = Candidate {
    id: 1,
    name: String::from("jane.brown@example.com"),
    email: String::from("Jane Brown"),
    experience_level: String::from("Senior"),
    interview_status: String::from("Scheduled"),
    application_status: String::from("In Review"),
};
The compiler cannot detect that I’ve mixed up `name` and `email`: both fields are plain `String` values, so the swapped assignment type-checks without complaint.
How could we fix this? newtype
The newtype pattern is typical in functional programming. In Haskell, this pattern is supported via the `newtype` declaration, which allows the programmer to define a new type identical to an existing one except for its name. This is useful for creating type-safe abstractions, enabling the programmer to enforce stronger type constraints on using specific values.
Similarly, in Rust, the newtype idiom brings compile-time guarantees that the correct value type is supplied. The newtype is a struct that wraps a single value and provides a new type for that value. A newtype is the same as the underlying type at runtime, so it will not introduce any performance overhead. Indeed, the code generated will be as efficient as if the underlying type was used directly, given that the Rust compiler will eliminate the newtype at compile-time.
This is exactly what we need to improve our model:
struct CandidateId(u64);
struct CandidateName(String);
struct CandidateEmail(String);
struct CandidateExperienceLevel(String);
struct CandidateInterviewStatus(String);
struct CandidateApplicationStatus(String);
struct Candidate {
    id: CandidateId,
    name: CandidateName,
    email: CandidateEmail,
    experience_level: CandidateExperienceLevel,
    interview_status: CandidateInterviewStatus,
    application_status: CandidateApplicationStatus,
}
Now the compiler will report an error if the values are mixed up. A correct `Candidate` instance looks like this:
let candidate = Candidate {
    id: CandidateId(1),
    name: CandidateName(String::from("Jane Brown")),
    email: CandidateEmail(String::from("jane.brown@example.com")),
    experience_level: CandidateExperienceLevel(String::from("Senior")),
    interview_status: CandidateInterviewStatus(String::from("Scheduled")),
    application_status: CandidateApplicationStatus(String::from("In Review")),
};
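To make the compile-time guarantee concrete, here is a minimal sketch. The `describe` and `id_has_same_size_as_u64` helpers are hypothetical (they are not part of the model above), and the newtypes are redeclared so the snippet stands alone.

```rust
use std::mem::size_of;

struct CandidateId(u64);
struct CandidateName(String);
struct CandidateEmail(String);

// Hypothetical helper: the signature alone documents which value goes where.
fn describe(name: &CandidateName, email: &CandidateEmail) -> String {
    // `.0` reaches the value wrapped by a tuple struct.
    format!("{} <{}>", name.0, email.0)
}

// Hypothetical helper: compares the size of the wrapper with the wrapped type.
fn id_has_same_size_as_u64() -> bool {
    size_of::<CandidateId>() == size_of::<u64>()
}

// Swapping the arguments is rejected at compile time:
// describe(&email, &name); // error[E0308]: mismatched types
```

In practice the size check holds, which matches the claim that the wrapper adds no runtime overhead. Note that the default struct layout is not formally specified, so a wrapper that must be guaranteed layout-compatible with its field would use `#[repr(transparent)]`.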
Going deeper into ADTs
In functional programming, ADTs are a way to represent structured data using product types and sum types.
- A product type is created by combining two or more data types into a new type (see the `Candidate` struct type above). In addition to `struct`, tuples are also product types in Rust.
- Sum types, also known as tagged unions, represent data that can take on exactly one of several possible variants. In Rust, sum types are defined using the `enum` keyword.
Following our example domain, we could add the following `enum` types:
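// `Debug` is derived so these values can be printed with `{:?}` later on.
#[derive(Debug)]
enum ExperienceLevel {
    Junior,
    MidLevel,
    Senior,
}
#[derive(Debug)]
enum InterviewStatus {
    Scheduled,
    Completed,
    Cancelled,
}
#[derive(Debug)]
enum ApplicationStatus {
    Submitted,
    UnderReview,
    Rejected,
    Hired,
}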
This way, we can "compose" these new sum types with our existing `Candidate` model. In other words, we are starting to see how this data composition between different types supports creating more complex types that accurately represent the data we are working with.
struct Candidate {
    id: CandidateId,
    name: CandidateName,
    email: CandidateEmail,
    experience_level: ExperienceLevel,
    interview_status: InterviewStatus,
    application_status: ApplicationStatus,
}
let candidate = Candidate {
    id: CandidateId(1),
    name: CandidateName(String::from("Jane Brown")),
    email: CandidateEmail(String::from("jane.brown@example.com")),
    experience_level: ExperienceLevel::Senior,
    interview_status: InterviewStatus::Scheduled,
    application_status: ApplicationStatus::UnderReview,
};
The `ExperienceLevel` enum represents a candidate’s possible levels of experience, while the `InterviewStatus` and `ApplicationStatus` enums represent the possible states of an interview and an application, respectively.
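One practical benefit of modeling states as a sum type is exhaustive pattern matching. The sketch below assumes a hypothetical `is_final` helper (not part of the original model) that decides whether an application has reached an end state.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ApplicationStatus {
    Submitted,
    UnderReview,
    Rejected,
    Hired,
}

// Hypothetical helper: an application is finished once the candidate
// has been either rejected or hired.
fn is_final(status: ApplicationStatus) -> bool {
    // The match must cover every variant, otherwise the code does not compile.
    match status {
        ApplicationStatus::Submitted | ApplicationStatus::UnderReview => false,
        ApplicationStatus::Rejected | ApplicationStatus::Hired => true,
    }
}
```

If a new variant were added to `ApplicationStatus` later, this `match` would stop compiling until the new case is handled, so the compiler points at every place the domain change affects.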
Pure Functions in Rust
Purity matters in every functional language: we should avoid side effects and mutable state as much as possible. Rust lets us write pure functions too. For example, we could add `new` associated functions for `CandidateId` or `CandidateName`:
struct CandidateId(u64);
impl CandidateId {
    fn new(id: u64) -> Self {
        CandidateId(id)
    }
}
struct CandidateName(String);
impl CandidateName {
    fn new(name: String) -> Self {
        CandidateName(name)
    }
}
These functions are pure as they do not have any side effects and only return the created object (`Self` in this context).
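Purity also applies to domain operations, not only constructors. As a rough sketch, a status change can be written as a function that returns a new value instead of mutating its input. The `with_application_status` function and the two-field `Candidate` below are simplifying assumptions, not the full model from this post.

```rust
#[derive(Debug, Clone, PartialEq)]
enum ApplicationStatus {
    Submitted,
    UnderReview,
    Rejected,
    Hired,
}

// Simplified stand-in for the article's `Candidate` (an assumption for brevity).
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    name: String,
    application_status: ApplicationStatus,
}

// Hypothetical pure function: it borrows the input, leaves it untouched,
// and returns a new value carrying the requested status.
fn with_application_status(candidate: &Candidate, status: ApplicationStatus) -> Candidate {
    Candidate {
        application_status: status,
        // Struct update syntax copies the remaining fields from the clone.
        ..candidate.clone()
    }
}
```

Because the original value is never modified, the same inputs always yield the same output, which keeps the function easy to test in isolation.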
Data Validation with `Result` and `Option` types
How should we model a candidate who has not been scheduled for an interview yet? The `Option` type is a good choice here.
struct Candidate {
    id: CandidateId,
    name: CandidateName,
    email: CandidateEmail,
    experience_level: ExperienceLevel,
    interview_status: Option<InterviewStatus>,
    application_status: ApplicationStatus,
}
let candidate = Candidate {
    id: CandidateId(1),
    name: CandidateName(String::from("Jane Brown")),
    email: CandidateEmail(String::from("jane.brown@example.com")),
    experience_level: ExperienceLevel::Senior,
    interview_status: None, // no status yet
    application_status: ApplicationStatus::UnderReview,
};
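Reading an optional field forces the caller to deal with the missing case. The following sketch uses two hypothetical helpers, `interview_label` and `needs_scheduling`, to show how an `Option<InterviewStatus>` might be consumed.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum InterviewStatus {
    Scheduled,
    Completed,
    Cancelled,
}

// Hypothetical helper: turns the optional status into a display label.
fn interview_label(status: Option<InterviewStatus>) -> &'static str {
    match status {
        // The "no interview yet" case cannot be forgotten: it is a variant.
        None => "Not scheduled yet",
        Some(InterviewStatus::Scheduled) => "Scheduled",
        Some(InterviewStatus::Completed) => "Completed",
        Some(InterviewStatus::Cancelled) => "Cancelled",
    }
}

// Hypothetical helper: a candidate without any status still needs an interview.
fn needs_scheduling(status: Option<InterviewStatus>) -> bool {
    status.is_none()
}
```

Compared with a sentinel value such as an empty string, the absence of a status is visible in the type, so it cannot be confused with a real state.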
We might have an additional requirement: we need to validate the candidate’s email address. We can add validation logic to the `CandidateEmail` constructor to ensure the email address is valid. We will use the `Result` type, which represents an operation that either succeeds or fails and is therefore a good fit for validation. Here’s a deliberately simple implementation:
impl CandidateEmail {
    fn new(email: String) -> Result<Self, String> {
        if email.contains('@') {
            Ok(CandidateEmail(email))
        } else {
            Err(String::from("Invalid email address"))
        }
    }
}
The function first checks whether the email address contains the '@' symbol using the `contains` method (defined on `str` and available on `String`). Of course, this is far from a robust way to validate an email, but I wanted to showcase a simplified version for didactic purposes. If the email address is invalid, the method returns an `Err` variant containing an error message as a `String`. On the other hand, if the email address is valid, the method returns a new `CandidateEmail` instance wrapped in the `Ok` variant.
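A `String` error is enough for a first version, but the error itself can also be modeled as a sum type. The sketch below is one possible variation, not the approach used in the rest of this post: `EmailError` and its variants are hypothetical names, and the extra emptiness check is an assumption added for illustration.

```rust
#[derive(Debug, PartialEq)]
struct CandidateEmail(String);

// Hypothetical error type: each variant names one reason for rejection.
#[derive(Debug, PartialEq)]
enum EmailError {
    Empty,
    MissingAtSign,
}

impl CandidateEmail {
    fn new(email: String) -> Result<Self, EmailError> {
        if email.trim().is_empty() {
            Err(EmailError::Empty)
        } else if !email.contains('@') {
            Err(EmailError::MissingAtSign)
        } else {
            Ok(CandidateEmail(email))
        }
    }
}
```

With a typed error, callers can `match` on the failure reason instead of comparing message strings.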
Alternatively, Rust provides the `From` and `TryFrom` traits that are useful for converting between types:
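// From<T> definition (as declared in the standard library, attributes omitted)
pub trait From<T>: Sized {
    fn from(value: T) -> Self;
}
// TryFrom<T> definition (as declared in the standard library, attributes omitted)
pub trait TryFrom<T>: Sized {
    type Error;
    fn try_from(value: T) -> Result<Self, Self::Error>;
}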
Notably, the `TryFrom` trait is handy if the conversion between types may fail, like in the `CandidateEmail` case.
use std::convert::TryFrom;
struct CandidateEmail(String);
impl TryFrom<String> for CandidateEmail {
    type Error = String;
    fn try_from(email: String) -> Result<Self, Self::Error> {
        if email.contains('@') {
            Ok(CandidateEmail(email))
        } else {
            Err(String::from("Invalid email address"))
        }
    }
}
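Implementing `TryFrom<String>` also provides `TryInto<CandidateEmail>` for `String` through a blanket implementation in the standard library. The sketch below repeats the implementation so it stands alone and adds a hypothetical `parse_email` helper to show both call styles.

```rust
use std::convert::{TryFrom, TryInto};

#[derive(Debug, PartialEq)]
struct CandidateEmail(String);

impl TryFrom<String> for CandidateEmail {
    type Error = String;

    fn try_from(email: String) -> Result<Self, Self::Error> {
        if email.contains('@') {
            Ok(CandidateEmail(email))
        } else {
            Err(String::from("Invalid email address"))
        }
    }
}

// Hypothetical helper: the return type tells `try_into` which conversion to use.
fn parse_email(raw: &str) -> Result<CandidateEmail, String> {
    raw.to_string().try_into()
}
```

Calling `CandidateEmail::try_from(value)` and `value.try_into()` run the same validation. The explicit `use` line is only needed on editions before 2021, where these traits are not in the prelude.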
We can handle `Candidate` creation in the same way, stacking up data and validations like this:
struct Candidate {
    id: CandidateId,
    name: CandidateName,
    email: CandidateEmail,
    experience_level: ExperienceLevel,
    interview_status: Option<InterviewStatus>,
    application_status: ApplicationStatus,
}
impl Candidate {
    fn new(
        id: CandidateId,
        name: CandidateName,
        email: String,
        experience_level: ExperienceLevel,
        interview_status: Option<InterviewStatus>,
        application_status: ApplicationStatus,
    ) -> Result<Self, String> {
        let candidate_email = CandidateEmail::try_from(email)?;
        Ok(Candidate {
            id,
            name,
            email: candidate_email,
            experience_level,
            interview_status,
            application_status,
        })
    }
}
Again, we could have applied the same conversion pattern to the `Candidate` type itself, but I am leaving this as an exercise.
With the implementation given above, we can create a new `Candidate` instance like this:
let candidate: Result<Candidate, String> =
    Candidate::new(
        CandidateId(2),
        CandidateName(String::from("John Doe")),
        String::from("johndoe@example.com"), // passing the String directly
        ExperienceLevel::Junior,
        Some(InterviewStatus::Scheduled),
        ApplicationStatus::UnderReview,
    );
Hence, if the email address is valid, we construct a new `CandidateEmail` instance and include it in the `Candidate` object. However, if `CandidateEmail::try_from` (or `CandidateEmail::new`, had we used it instead) returns an `Err` value, the `?` operator immediately returns that error to the caller of `Candidate::new`.
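To see what the `?` operator does, it can help to compare it with an explicit `match`. In this sketch, `invitation` and `invitation_explicit` are hypothetical functions written only for the comparison.

```rust
struct CandidateEmail(String);

impl CandidateEmail {
    fn new(email: String) -> Result<Self, String> {
        if email.contains('@') {
            Ok(CandidateEmail(email))
        } else {
            Err(String::from("Invalid email address"))
        }
    }
}

// Hypothetical function using `?`: on `Err`, it returns early with that error.
fn invitation(email: String) -> Result<String, String> {
    let valid = CandidateEmail::new(email)?;
    Ok(format!("Invitation sent to {}", valid.0))
}

// Roughly the same logic written by hand. The real `?` additionally converts
// the error with `From::from`, which is a no-op when the error types match.
fn invitation_explicit(email: String) -> Result<String, String> {
    let valid = match CandidateEmail::new(email) {
        Ok(valid) => valid,
        Err(e) => return Err(e),
    };
    Ok(format!("Invitation sent to {}", valid.0))
}
```

Both functions behave identically; the `?` version simply removes the boilerplate, which matters once several validations are chained in one constructor.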
Expressing Behaviors with Traits
Let’s imagine we introduce a new behavior or requirement to manage the candidates’ interviews, where hiring managers and recruiters could schedule interviews. This way, we could define the behavior in the following fashion:
trait Interviewer {
    fn schedule_interview(&self, candidate: &Candidate) -> Result<InterviewStatus, String>;
}
Then, we could make the `HiringManager` and `Recruiter` entities implement this trait:
struct HiringManager {
    name: String,
}
impl Interviewer for HiringManager {
    fn schedule_interview(&self, candidate: &Candidate) -> Result<InterviewStatus, String> {
        // Biz and validation logic to schedule an interview with given candidate
        // ...
        Ok(InterviewStatus::Scheduled)
    }
}
struct Recruiter {
    name: String,
}
impl Interviewer for Recruiter {
    fn schedule_interview(&self, candidate: &Candidate) -> Result<InterviewStatus, String> {
        // Biz and validation logic to schedule an interview with given candidate
        // ...
        Ok(InterviewStatus::Scheduled)
    }
}
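Using this approach, we could write functions that accept any type implementing `Interviewer` (expressed with a generic bound such as `I: Interviewer`, or with the `impl Interviewer` shorthand) and call the `schedule_interview` function on it without worrying about the concrete type of the argument.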
fn schedule_interview<I: Interviewer>(
    interviewer: &I,
    candidate: &Candidate,
) -> Result<InterviewStatus, String> {
    interviewer.schedule_interview(candidate)
}
// Placeholders: build real values here.
let candidate: Candidate = unimplemented!();
let interviewer: HiringManager = unimplemented!();
// Printing with `{:?}` requires `InterviewStatus` to implement `Debug`.
match schedule_interview(&interviewer, &candidate) {
    Ok(status) => println!("Interview status: {:?}", status),
    Err(e) => println!("The interview scheduling failed: {:?}", e),
};
As you can see, we have defined the `schedule_interview` function to take any type `I` that implements the `Interviewer` trait. This allows us to pass in any concrete type that implements the trait without worrying about the concrete type itself. Then, to complete the example, the `match` expression calls the `schedule_interview` function with the `interviewer` and `candidate` variables as arguments and pattern matches on the resulting `Result`. If the result is `Ok`, it prints the interview status; otherwise, it prints the error message.
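The same idea can be written with the `impl Trait` shorthand in argument position. The sketch below is self-contained and makes several assumptions for brevity: `Candidate` is reduced to a name, the structs carry no fields, `schedule_with` is a hypothetical function name, and the rule that a recruiter rejects nameless candidates is invented purely to exercise the `Err` branch.

```rust
#[derive(Debug, PartialEq)]
enum InterviewStatus {
    Scheduled,
}

// Simplified stand-in for the article's `Candidate`.
struct Candidate {
    name: String,
}

trait Interviewer {
    fn schedule_interview(&self, candidate: &Candidate) -> Result<InterviewStatus, String>;
}

struct HiringManager;
struct Recruiter;

impl Interviewer for HiringManager {
    fn schedule_interview(&self, _candidate: &Candidate) -> Result<InterviewStatus, String> {
        Ok(InterviewStatus::Scheduled)
    }
}

impl Interviewer for Recruiter {
    // Hypothetical rule: a recruiter refuses candidates without a name.
    fn schedule_interview(&self, candidate: &Candidate) -> Result<InterviewStatus, String> {
        if candidate.name.is_empty() {
            Err(String::from("Candidate has no name"))
        } else {
            Ok(InterviewStatus::Scheduled)
        }
    }
}

// `&impl Interviewer` accepts any implementor, like the generic `I: Interviewer` bound.
fn schedule_with(
    interviewer: &impl Interviewer,
    candidate: &Candidate,
) -> Result<InterviewStatus, String> {
    interviewer.schedule_interview(candidate)
}
```

Each implementor keeps its own scheduling rules, while calling code depends only on the `Interviewer` behavior.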
Summary
In this post, we have explored how Algebraic Data Types, pure functions, the `Result` and `Option` types, and traits are powerful concepts that can help to design domain models using functional programming in Rust. ADTs can model domain concepts and provide type safety, while pure functions ensure referential transparency and make it easier to reason about the program. The `Result` and `Option` types allow for more expressive error handling, and traits can abstract over concrete types and promote flexibility in the program design. Using these concepts, Rust developers can create robust, scalable, and maintainable domain models that are easier to test and extend over time.
In the next article, I will examine how smart pointers like `Box`, `Rc`, and `Arc` can further enhance our functional domain modeling in Rust, allowing us to manage memory efficiently and share data between multiple parts of our program.