Factory Method#
Definition#
The Factory Method is a creational design pattern that provides an interface for creating objects in a superclass but allows subclasses to alter the type of objects that will be created. Essentially, it delegates the instantiation process to subclasses, promoting loose coupling and adherence to the Open/Closed Principle—software entities should be open for extension but closed for modification.
Why “Factory Method”?#
The term “Factory” signifies a component responsible for creating objects, much like a manufacturing factory produces goods. The “Method” aspect refers to the pattern’s use of methods to encapsulate the creation logic. Therefore, the Factory Method encapsulates object creation within methods, enabling flexibility and scalability in how objects are instantiated and managed.
Here relate to the named constructor is also loosely a factory method.
Factory Method in the RAG System#
RAG System Components#
In the RAG system, several core components interact to process queries:
Chunkers: Divide input text into manageable chunks.
Embedders: Transform text chunks into numerical embeddings.
Retrievers: Fetch relevant documents based on embeddings.
Each of these components can have multiple implementations (e.g.,
SimpleChunker
, AdvancedChunker
), each with distinct behaviors and
performance characteristics.
Implementing the Factory Method#
In our implementation, the Factory Method pattern is embodied by the
ComponentFactory
class. This factory is responsible for creating instances of
Chunker
, Embedder
, and Retriever
based on specified identifiers.
from abc import ABC, abstractmethod
from typing import List
class Chunker(ABC):
@abstractmethod
def chunk(self, text: str) -> List[str]:
...
class Embedder(ABC):
@abstractmethod
def embed(self, chunks: List[str]) -> List[List[float]]:
...
class Retriever(ABC):
@abstractmethod
def retrieve(self, query_embedding: List[float]) -> List[str]:
...
import hashlib
import re
from typing import List
from omnixamples.software_engineering.design_patterns.factory_method.abstract_methods import (
Chunker,
Embedder,
Retriever,
)
class SimpleChunker(Chunker):
def chunk(self, text: str) -> List[str]:
return text.split(".")
class AdvancedChunker(Chunker):
def chunk(self, text: str) -> List[str]:
return re.split(r"(?<=[.!?]) +", text)
class SimpleEmbedder(Embedder):
def embed(self, chunks: List[str]) -> List[List[float]]:
embeddings = []
for chunk in chunks:
embedding = [float(ord(c)) / 1000.0 for c in chunk if c.isalpha()]
embeddings.append(embedding[:10]) # Limit to first 10 values for simplicity
return embeddings
class AdvancedEmbedder(Embedder):
def embed(self, chunks: List[str]) -> List[List[float]]:
# Placeholder for advanced embedding logic, e.g., using transformers
return [[0.1 * i for i in range(10)] for _ in chunks]
class SimpleRetriever(Retriever):
def retrieve(self, query_embedding: List[float]) -> List[str]:
hash_val = int(hashlib.sha256(str(query_embedding).encode("utf-8")).hexdigest(), 16)
return [f"Document_{hash_val % 100}" for _ in range(3)]
class AdvancedRetriever(Retriever):
def retrieve(self, query_embedding: List[float]) -> List[str]:
# Placeholder for advanced retrieval logic, e.g., semantic search
hash_val = int(hashlib.sha256(str(query_embedding).encode("utf-8")).hexdigest(), 16)
return [f"AdvancedDoc_{hash_val % 100}" for _ in range(3)]
from __future__ import annotations
from abc import ABC
from typing import Callable, Dict, Type, TypeVar
from omnixamples.software_engineering.design_patterns.factory_method.abstract_methods import (
Chunker,
Embedder,
Retriever,
)
from omnixamples.software_engineering.design_patterns.factory_method.concrete_methods import (
AdvancedChunker,
AdvancedEmbedder,
AdvancedRetriever,
SimpleChunker,
SimpleEmbedder,
SimpleRetriever,
)
C = TypeVar("C", bound=ABC)
class ComponentFactory:
"""Factory class for creating components.
{
'chunker': {
'simple': <function create_simple_chunker at 0x102258670>,
'advanced': <function create_advanced_chunker at 0x1023d53a0>
},
'embedder': {
'simple': <function create_simple_embedder at 0x1023d5700>,
'advanced': <function create_advanced_embedder at 0x1023d58b0>
},
'retriever': {'simple': <function create_simple_retriever at 0x1023d5b80>}
}
"""
_creators: Dict[str, Dict[str, Callable[[], C]]] = {}
@classmethod
def register(
cls: Type[ComponentFactory], category: str, identifier: str
) -> Callable[[Callable[[], C]], Callable[[], C]]:
def decorator(creator: Callable[[], C]) -> Callable[[], C]:
if category not in cls._creators:
cls._creators[category] = {}
cls._creators[category][identifier] = creator
return creator
return decorator
@classmethod
def get_component(cls: Type[ComponentFactory], category: str, identifier: str) -> C:
try:
creator = cls._creators[category][identifier]
return creator()
except KeyError as exc:
available = cls._creators.get(category, {})
raise ValueError(
f"Unknown {category} type: '{identifier}'. Available types: {list(available.keys())}"
) from exc
# Register Chunkers
@ComponentFactory.register(category="chunker", identifier="simple")
def create_simple_chunker() -> Chunker:
return SimpleChunker()
@ComponentFactory.register(category="chunker", identifier="advanced")
def create_advanced_chunker() -> Chunker:
return AdvancedChunker()
# Register Embedders
@ComponentFactory.register(category="embedder", identifier="simple")
def create_simple_embedder() -> Embedder:
return SimpleEmbedder()
@ComponentFactory.register(category="embedder", identifier="advanced")
def create_advanced_embedder() -> Embedder:
return AdvancedEmbedder()
# Register Retrievers
@ComponentFactory.register(category="retriever", identifier="simple")
def create_simple_retriever() -> Retriever:
return SimpleRetriever()
@ComponentFactory.register(category="retriever", identifier="advanced")
def create_advanced_retriever() -> Retriever:
return AdvancedRetriever()
from __future__ import annotations
from typing import List
from pydantic import BaseModel
from omnixamples.software_engineering.design_patterns.factory_method.abstract_methods import (
Chunker,
Embedder,
Retriever,
)
from omnixamples.software_engineering.design_patterns.factory_method.registry import ComponentFactory
class RAGSystem(BaseModel):
chunker: Chunker
embedder: Embedder
retriever: Retriever
@staticmethod
def create_system(chunker_type: str, embedder_type: str, retriever_type: str) -> RAGSystem:
chunker = ComponentFactory.get_component("chunker", chunker_type)
embedder = ComponentFactory.get_component("embedder", embedder_type)
retriever = ComponentFactory.get_component("retriever", retriever_type)
return RAGSystem(chunker=chunker, embedder=embedder, retriever=retriever)
def process_query(self, query: str) -> List[str]:
chunks = self.chunker.chunk(query)
embeddings = self.embedder.embed(chunks)
if not embeddings:
return []
query_embedding = embeddings[0]
retrieved_docs = self.retriever.retrieve(query_embedding)
return retrieved_docs
class Config:
arbitrary_types_allowed = True
from omnixamples.software_engineering.design_patterns.factory_method.rag import RAGSystem
def main() -> None:
# Configuration for the RAG system
configurations = [
{"chunker_type": "simple", "embedder_type": "simple", "retriever_type": "simple"},
{"chunker_type": "advanced", "embedder_type": "advanced", "retriever_type": "advanced"},
{"chunker_type": "advanced", "embedder_type": "simple", "retriever_type": "advanced"},
]
# Example query
query = (
"Implementing design patterns can greatly improve software architecture. "
"Factory Method is one such pattern. "
"It provides a way to delegate the instantiation logic to subclasses."
)
for idx, config in enumerate(configurations, start=1):
print(f"\n--- RAG System Configuration {idx} ---")
try:
rag_system = RAGSystem.create_system(
chunker_type=config["chunker_type"],
embedder_type=config["embedder_type"],
retriever_type=config["retriever_type"],
)
retrieved_documents = rag_system.process_query(query)
print("Retrieved Documents:")
for doc in retrieved_documents:
print(f"- {doc}")
except ValueError as e:
print(f"Error: {e}")
if __name__ == "__main__":
main()
What Are The Key Elements?#
Abstract Base Classes (ABCs): Define the interfaces (
Chunker
,Embedder
,Retriever
) that all concrete implementations must follow.Concrete Implementations: Provide specific behaviors for each component (e.g.,
SimpleChunker
,AdvancedEmbedder
).ComponentFactory: A generic factory that uses a registry to map component categories and identifiers to their respective creation methods.
Registration Decorators: Allow components to register themselves with the factory, specifying their category and identifier.
RAGSystem Assembly: Utilizes the factory to instantiate required components based on configuration.
How It Works?#
Registration: Each concrete component class registers itself with the
ComponentFactory
using a decorator, specifying its category (chunker
,embedder
,retriever
) and a unique identifier (simple
,advanced
).@ComponentFactory.register(category='chunker', identifier='simple') def create_simple_chunker() -> Chunker: return SimpleChunker()
Instantiation: When assembling the
RAGSystem
, the factory’sget_component
method is invoked with the desired category and identifier to obtain an instance of the required component.rag_system = RAGSystem.create_system( chunker_type='simple', embedder_type='advanced', retriever_type='simple' )
Usage: The
RAGSystem
uses the instantiated components to process queries seamlessly, without needing to know the specifics of each component’s implementation.
Why Use the Factory Method Pattern?#
a. Encapsulation of Object Creation#
By centralizing the instantiation logic within the factory, the system decouples object creation from its usage. This means that the rest of the system doesn’t need to know the details of how components are created, fostering a separation of concerns.
b. Adherence to the Open/Closed Principle#
The Factory Method allows the system to be open for extension but closed
for modification. New component types can be added without altering existing
factory logic or the RAGSystem
assembly code. For example, introducing an
AdvancedRetriever
only involves creating the class and registering it with the
factory.
c. Promoting Loose Coupling#
Components are not tightly bound to their implementations. The RAGSystem
interacts with components through their abstract interfaces (Chunker
,
Embedder
, Retriever
), allowing for interchangeable implementations without
impacting the system’s core logic.
d. Enhancing Scalability and Maintainability#
As the system grows, managing component instantiations becomes more straightforward. Adding, removing, or modifying components doesn’t require widespread changes across the codebase. The factory serves as a single point of management, simplifying maintenance and scalability.
e. Facilitating Testing and Mocking#
With a centralized factory, testing becomes more manageable. Mock components can be registered with the factory during tests, allowing for controlled and predictable testing environments without altering the system’s core logic.
We Adhere To Good Design Principles#
a. Single Responsibility Principle (SRP)#
Each class has a single responsibility:
Components (Chunkers, Embedders, Retrievers): Handle their specific processing logic.
ComponentFactory: Manages the creation of components.
RAGSystem: Orchestrates the interaction between components to process queries.
This segregation ensures that changes in one area (e.g., adding a new
Embedder
) don’t ripple across unrelated parts of the system.
b. Dependency Inversion Principle (DIP)#
High-level modules (RAGSystem
) depend on abstractions (Chunker
, Embedder
,
Retriever
), not on concrete implementations. The factory provides these
abstractions, allowing high-level modules to remain agnostic of low-level
details.
c. Open/Closed Principle (OCP)#
The system is designed to accommodate new components without modifying existing code. This is achieved through the factory’s registration mechanism, where new components can be introduced by simply registering them, leaving existing code untouched.
d. Interface Segregation Principle (ISP)#
Components adhere to their respective interfaces, ensuring that they only implement the methods relevant to their roles. This prevents bloated interfaces and promotes cleaner, more focused component designs.
e. Liskov Substitution Principle (LSP)#
Subtypes (e.g., AdvancedChunker
) can replace their base types (Chunker
)
without altering the correctness of the program. The factory ensures that
components instantiated via the factory conform to their abstract interfaces,
maintaining substitutability.